Term
|
Definition
The path printed out by the pwd command. The path to that directory, starting from the root directory. Starts with a / character. |
|
|
Term
|
Definition
A unique ID for searching for nucleotides in Genbank. |
|
|
Term
|
Definition
When you have a read, you must align it with a sequence in the reference genome. This is what alignment software does: it locates the read within the 3 billion bp genome. Allows you to identify nucleotide differences. Includes Bowtie, BWA, and STAR. The user can set criteria for alignments, allowing a certain number of gaps and mismatches between sequences and the reference genome. The more differences allowed, the more false positives and true positives. Output is in the .SAM format. Outcomes of a read are:
1. The read does not match anywhere in the genome. It may be because the reference genome is incomplete, or the read sequence has diverged from the reference genome (indels and SNPs). The software determines how much difference is accepted for a match.
2. The read matches to one site very well.
3. The read matches up equally well to more than one site. This is caused by genome repetition.
4. Paired ends match up at different sites: different fragments of the read match to separate sites, even on separate chromosomes. Caused by DNA that has fragmented and moved around relative to the reference genome. |
|
|
Term
|
Definition
Differences in genes within protein-coding sequences, or within genes that regulate how genes are expressed. Contributes to genetic variation. |
|
|
Term
|
Definition
The gene which controls beak shape in Darwin's finches. Involved in craniofacial development. A homeobox gene. |
|
|
Term
|
Definition
The trait from which the derived trait evolved; it is found in the common ancestor. In Darwin's finches, pointed beaks are the ancestral trait. |
|
|
Term
|
Definition
A gene at position 5.2 million bp in the chicken genome. Contains a SNP in close linkage with the yellow skinned trait. |
|
|
Term
|
Definition
|
|
Term
|
Definition
Stores zero or more scalars as elements which can be accessed with an integer index. You can add, remove, or replace elements or sets of elements. Denoted with the @ sigil. |
|
|
Term
|
Definition
A rare inherited condition, with which the case study GIT 264-1 was diagnosed. Caused by a defect in the kidney's ability to reabsorb sodium. The kidneys remove too much potassium from the body. |
|
|
Term
|
Definition
In exome sequencing, the proportion of DNA sequenced which belongs to the exome. While only exome genes are found on the chip, hybridization is not perfect, and there are hanging fragments which get sequenced as well. Ideally it is 100%. In the GIT 264-1 sequencing, bases mapped to exome was 34.8%. |
|
|
Term
|
Definition
The gene for yellow skin in chickens is suspected to be this gene. Encodes beta-carotene dioxygenase 2, which cleaves colourful carotenoids into colourless apocarotenoids. A SNP is found in chicken breeds with yellow skin, and not in chicken breeds with white skin; the allele is fixed. When the gene was sequenced in white- and yellow-skinned chicken breeds, no nucleotide changes were found that would be expected to change protein function; rather, the breeds differed at nucleotides important for transcriptional regulation, so the yellow allele must regulate BCDO2 differently than the white allele. The mRNA is weakly expressed in yellow-skinned chickens relative to white-skinned chickens, but only in skin tissues. |
|
|
Term
|
Definition
Analysis of molecular data. Uses existing techniques to make sense of information in DNA sequences. |
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
http://bowtie-bio.sourceforge.net/index.shtml
An alignment software. Outputs some optional fields for each alignment in the .SAM format, depending on the type of alignment, including edit distance. |
|
|
Term
|
Definition
http://bio-bwa.sourceforge.net/
An alignment software. Used in the study with Darwin's finches. |
|
|
Term
|
Definition
Traits must have two possibilities. Tests for an association between a trait and DNA sequences. Uses a contingency table. Gives a p value. The greater the deviation from the expected, given the null hypothesis, the more evidence for association. The $\chi^2$ statistic is larger the stronger the association.
$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$ |
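The statistic can be computed directly from a contingency table. The Python sketch below is only an illustration: the function name `chi_square` and the counts are made up, and the expected counts are the ones implied by the null hypothesis of no association (row total x column total / grand total).

```python
def chi_square(table):
    """Return the chi-square statistic for a contingency table (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if trait and allele were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows are the two trait states, columns the two alleles.
counts = [[30, 10],
          [10, 30]]
print(chi_square(counts))
```

Converting the statistic into a p value needs the chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom, which is not shown here.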
|
|
Term
|
Definition
A roundworm. Its genome is 97 Mbp. |
|
|
Term
|
Definition
Gallus gallus
Its ancestor is jungle fowl. It has 78 chromosomes, all of which may have nucleotide differences with each other. Its genome is 1,200 Mbp. |
|
|
Term
Chromosomal rearrangement |
|
Definition
Includes inversions and translocations. FISH staining can reveal them. They are fairly common. More important for creating variation. Not used to assay DNA polymorphisms. Important for fertility; they can cause unviable versions of genes. |
|
|
Term
|
Definition
The sixth information point given in the .SAM format. Describes how the query aligns to the reference. The string "3S97M" indicates that the first 3 bases of the read were soft-clipped (left out of the alignment) and the remaining 97 bases were aligned as matches or mismatches. |
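A CIGAR string is a run of length/operation pairs, so it can be split with a regular expression. This is a minimal Python sketch; the function names are illustrative, and the operation letters (M, I, D, N, S, H, P, =, X) are those defined by the SAM specification.

```python
import re


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(length), op)
            for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def aligned_read_bases(cigar):
    """Count read bases placed against the reference (M, = and X operations)."""
    return sum(length for length, op in parse_cigar(cigar) if op in "M=X")


print(parse_cigar("3S97M"))
print(aligned_read_bases("3S97M"))
```

For "3S97M" the first pair is the 3 soft-clipped bases and the second is the 97 aligned bases.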
|
|
Term
Coding single nucleotide variant (cSNV) |
|
Definition
Single nucleotide differences within coding sequences. |
|
|
Term
|
Definition
The higher the score, the rarer a SNP is at a certain point, and the more conserved the gene. High scores are found in TFB sites. |
|
|
Term
|
Definition
Used for doing a chi square test. A table showing the frequencies of combinations of traits and DNA sequences. |
|
|
Term
|
Definition
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
The value which is chosen as the significance threshold. If a p value falls below this value, the null hypothesis is rejected. Typically it is 0.05, but a much smaller (stricter) value is often used in bioinformatics because so many sites are tested at once. |
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
Geospiza sp.
Famous because they are mentioned in On the Origin of Species. Have been widely studied as a model of speciation and adaptive evolution. Illustrate a young adaptive radiation. One trait that has diverged is beak shape. Different beak shapes enable the birds to eat different foods. Lamichhaney et al. (2015) discovered a likely gene contributing to beak evolution. Reads of 100 bp were made, with 10x coverage, for 200 individuals. The reads were aligned using BWA alignment software, and GATK was used to identify SNPs and indels. Birds with blunt beaks should have different alleles than those with pointed beaks at loci that contribute to beak shape; allele frequencies should differ at these loci. Blunt beak was the derived trait, and had little genetic diversity. The gene controlling beak shape was found to be ALX homeobox 1. |
|
|
Term
|
Definition
A repository of information that enables entering and extracting data. Usually consists of tables containing records and fields. A simple database would be a single page with information about students in a class. Usually have a carefully controlled format and restricted vocabulary, so computers can retrieve information easily by searching across fields. |
|
|
Term
|
Definition
The number of times a location within an individual is sequenced. |
|
|
Term
|
Definition
Novel trait
The trait which evolved from the ancestral trait. Loci for derived traits have less genetic diversity. In Darwin's finches, blunt beaks are the derived trait. |
|
|
Term
|
Definition
It consists of adenine (A), guanine (G), cytosine (C), and thymine (T). It is double-stranded. Written from 5' end to 3' end. Genes and other attributes can be encoded on either strand. |
|
|
Term
|
Definition
A use of sequencing technology. ChIP sequencing is used. Finds where genes for proteins are located. |
|
|
Term
|
Definition
A bacterium. Its genome is 5 Mbp. |
|
|
Term
|
Definition
An optional field in the .SAM format output of Bowtie. Gives the number of changes it would take to make the query sequence the same as the reference; a value of 2 means it would take two changes. |
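To make the idea of counting changes concrete, here is a small Python sketch of the classic edit (Levenshtein) distance between two sequences. It illustrates the concept only; it is not a description of how Bowtie computes its value, and the function name is made up.

```python
def edit_distance(query, reference):
    """Minimum number of substitutions, insertions and deletions
    needed to turn `query` into `reference`."""
    previous = list(range(len(reference) + 1))
    for i, q_base in enumerate(query, start=1):
        current = [i]
        for j, r_base in enumerate(reference, start=1):
            cost = 0 if q_base == r_base else 1
            current.append(min(previous[j] + 1,          # drop a query base
                               current[j - 1] + 1,       # add a reference base
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


print(edit_distance("ACGT", "AGGA"))
```

Here the two sequences differ at two positions, so two substitutions are needed.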
|
|
Term
|
Definition
Contributes to trait variation. Can include diet. Includes things which we don't understand. |
|
|
Term
|
Definition
Variation due to environmental causes. Can be due to known or unknown causes. |
|
|
Term
|
Definition
Changes in chromatin and histones that can affect phenotype. Involves epimutations and epialleles. The DNA sequence is the same, but with chemical differences. Contributes to trait variation. |
|
|
Term
|
Definition
A function of allele frequencies between populations. Based on the frequency of heterozygotes expected, given a whole, random mating population, compared to the frequency of heterozygotes observed within subpopulations. If there is low diversity within a population, one expects a low frequency of heterozygotes. A value of 0 indicates no allelic differentiation between populations. A value of 1 indicates maximum allelic differentiation between populations.
$F_{ST} = \frac{H_T - H_S}{H_T}$ |
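A worked sketch in Python may help. It makes several assumptions that are not in the card: one site with two alleles, two subpopulations of equal size, and expected heterozygosity taken as 2p(1 - p) under random mating. The function names are illustrative.

```python
def heterozygosity(p):
    """Expected heterozygote frequency for allele frequency p under random mating."""
    return 2 * p * (1 - p)


def fst(p1, p2):
    """F_ST for two equally sized subpopulations with allele frequencies p1 and p2."""
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2  # average within subpopulations
    h_t = heterozygosity((p1 + p2) / 2)                  # pooled as one population
    if h_t == 0:
        return 0.0  # the same allele is fixed everywhere, so nothing to compare
    return (h_t - h_s) / h_t


print(fst(0.5, 0.5))  # identical frequencies
print(fst(1.0, 0.0))  # different alleles fixed in each subpopulation
```

The two calls correspond to the extremes described above: no differentiation and maximum differentiation.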
|
|
Term
|
Definition
A genetic variant that appears to contribute to a trait, but does not. If controls have different ancestry than diseased individuals, then regions of the genome where the two populations differ can associate with disease. This can arise if one population is more susceptible to disease than another. |
|
|
Term
|
Definition
Using more than one site on a chromosome to map the order of genes and their distances from each other. The objective is to identify the location on a chromosome of the sequence that controls a trait. One can map the DNA sequence contributing to trait variation to between two chromosomal positions on the physical map. The mapping can be very precise, down to an individual gene or to a chromosomal region. Crossovers are used to identify gene locations. There may be a weak correlation between genetic and physical distance, so all mapping is approximate. |
|
|
Term
|
Definition
The sum of applicable flags is the second information point given in the .SAM format. Gives information about the mate pair mapping, among other things. |
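Because each flag is a separate bit, the sum can be taken apart again with bitwise tests. The sketch below uses the bit values from the SAM specification; the short descriptions and the function name are my own wording.

```python
FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """List the descriptions of every bit that is set in a SAM flag sum."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


print(decode_flag(99))  # 99 = 1 + 2 + 32 + 64
```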
|
|
Term
|
Definition
|
|
Term
|
Definition
A gene where a mutation at site 90 provides some protection against malaria. It has carried closely linked alleles as it increased in frequency in the population. When homozygous, this mutation causes sickle cell anemia. |
|
|
Term
|
Definition
An annotated collection of all publicly available DNA sequences, and other sources. You can search the nucleotide database with accession numbers. Output is in Genbank format. |
|
|
Term
|
Definition
The output of searching Genbank with an accession number. Contains a summary of information about the sequence, sequence attributes, and the sequence itself. The page is populated by data extracted from the database. |
|
|
Term
|
Definition
https://genome.ucsc.edu/index.html
A genome data browser available for many organisms, including chickens. Allows you to investigate a region of a chromosome, and shows gene sequence and annotation with genes and protein coding sites. Genome sequence projects produce the DNA sequence of chromosomes, as completely as possible. |
|
|
Term
|
Definition
Mutations that cause disease are either a stop codon, amino acid change, or alteration to splice sites. |
|
|
Term
|
Definition
The distance between two genes can be measured in cM using this formula:
Distance = (# recombinant types / # gametes) x 100 |
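As a quick check of the arithmetic, the formula can be written as a one-line Python function. The function name and the counts are hypothetical.

```python
def map_distance_cm(recombinant_gametes, total_gametes):
    """Genetic distance in centimorgans from counts of gametes."""
    return 100 * recombinant_gametes / total_gametes


# Hypothetical backcross: 9 recombinant gametes out of 100 scored.
print(map_distance_cm(9, 100))
```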
|
|
Term
|
Definition
If the genome is sequenced, SNPs can be located in the sequence. |
|
|
Term
|
Definition
Trait variation that is due to DNA sequence differences or epigenetic differences. May be due to nucleotide differences of major genes, or due to nucleotide differences at many minor genes. May refer to differences between DNA sequences, produced by mutations. Includes allelic differences. Cause gene functional differences through changes to the gene's coding sequences or changes to the gene's regulatory sequences. |
|
|
Term
|
Definition
DNA is housed in the nucleus, in chromosomes. There is also cytoplasmic DNA in mitochondria and plastids. In humans, there are 3 billion base pairs in the whole genome, with two copies for a total of 6 billion base pairs! This amount of data is equal to a stack of printed paper 1,524 m tall! |
|
|
Term
Genome analysis toolkit (GATK) |
|
Definition
SNP and indel identification software. Used in the study on Darwin's finches. |
|
|
Term
Genome-wide association (GWAS) studies |
|
Definition
One samples an existing population that is the product of an unknown/complex pedigree to test for associations between nucleotide and trait variation. |
|
|
Term
Genome-wide association of 14,000 cases of seven common diseases and 3,000 shared controls |
|
Definition
A study on common illnesses with environmental factors and polygenic allele distribution. The final results were in the form of a case matrix and control matrix. Looking for alleles which were present at different frequencies in cases compared to controls. 500,000 loci were sampled; it is likely that some SNP positions are in linkage disequilibrium with neighboring genes that could affect disease. Diseases analyzed include bipolar disorder, coronary artery disease, Crohn's disease, hypertension, rheumatoid arthritis, type I diabetes, and type II diabetes. |
|
|
Term
|
Definition
A case study of a Turkish male who at 5 months was evaluated for failure to thrive, dehydration, and diarrhea. Had a premature birth at 30 weeks, and parental consanguinity, with two spontaneous abortions and death of a premature sibling on day 4. Diagnosed with Bartter syndrome. There were 20 homozygous deletions, all in the Database of Genome Variants, but none altering protein coding sequences or known to associate with Bartter syndrome. Performed a whole exome sequencing. Found 2,495 genes homozygous by consanguineous descent, and 1,493 nucleotide variants in coding regions, including 10 in highly conserved proteins, including SLC26A3. They first found SNPs using a pre-manufactured chip, but found no unusual SNPs in important regions. Then they sequenced his exome using massively parallel DNA sequencing, 5 billion bp, with a 100x coverage. Huge amounts of sequences were acquired, and a series of filters were used to diagnose him with a mutation in SLC26A3, causing chloride-losing diarrhea. |
|
|
Term
|
Definition
An important aspect of genetic variation analysis. You cannot make a blanket statement about a population based on one group. Genetic variation observed is a function of the group being investigated. |
|
|
Term
|
Definition
The proportion of heterozygous individuals expected from two populations, given each is a separate random mating population. |
|
|
Term
|
Definition
The proportion of heterozygous individuals expected from two populations if they form one, random mating population. |
|
|
Term
|
Definition
Individuals with the same combination of alleles. Include AB, Ab, aB, and ab. |
|
|
Term
|
Definition
A data structure like an array, but instead of associating a scalar with an index position, it associates a scalar with a string key. The key-value pairs aren't arrayed in the same order they were put in. Denoted with the % sigil. |
|
|
Term
|
Definition
[individuals heterozygous for risk allele] / [individuals heterozygous for control allele] |
|
|
Term
|
Definition
A group of transcription factors that determine in what cells genes are expressed. |
|
|
Term
|
Definition
Every read, from both chromosomes, has a SNP compared to a reference genome. For case GIT 264, there were 9,045, and 117 of them were novel, never seen before. The allele causing disease must be homozygous; if both parents are unaffected by the disease they must be heterozygotes. |
|
|
Term
|
Definition
Genome is 3,000 Mbp. A gene is about 10,000 bp. There are 30,000 genes. About 300 Mbp of coding sequences. |
|
|
Term
|
Definition
Suggested that four species of jungle fowl contributed to the domestic chicken. Yellow skin alleles for BCDO2 seem to come from grey jungle fowl. Grey jungle fowl must have interbred with early domestic chickens. |
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
Two individuals which have the same allele which arose from a common ancestor. |
|
|
Term
|
Definition
Two individuals which have the same alleles, but they did not arise from a common ancestor. |
|
|
Term
|
Definition
A sequencing technology. The manufacturer includes in the output an estimate of the confidence that the sequence is correct. The raw data output is as follows:
1. Line 1 begins with an '@' character and sequence identifier.
2. Line 2 contains the nucleotide sequence. This is the read.
3. Line 3 begins with a '+' character and may be followed by the sequence identifier.
4. Line 4 encodes the quality values for the sequence in line 2. The quality symbol associated with each nucleotide is ASCII-coded, and can be translated into a Q score. |
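The four-line layout above can be read with a few lines of Python. This is a sketch under one stated assumption: the quality symbols use the Phred+33 encoding (ASCII code minus 33 gives the Q score). The record and the function name are made up.

```python
def parse_fastq_record(lines):
    """Turn the four lines of one FASTQ record into (identifier, read, Q scores)."""
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("read and quality lines differ in length")
    # Assumed Phred+33 encoding: subtract 33 from each ASCII code.
    q_scores = [ord(symbol) - 33 for symbol in quality]
    return header[1:], sequence, q_scores


record = ["@read1", "ACGT", "+", "II5!"]
print(parse_fastq_record(record))
```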
|
|
Term
|
Definition
Insertion/deletion event
Repeat number differences. A site in the DNA where there is a sequence present in one individual and absent in another. Can be very large, up to megabases, or they can be a few nucleotides. |
|
|
Term
|
Definition
The natural ancestor of chickens, as thought by Darwin. They all have white skin. Hutt suggested that grey jungle fowl, which have yellow legs, may be the source of the genes for yellow skinned chickens. Found in South Asia. There are four species: red, grey, Ceylon, and green. |
|
|
Term
|
Definition
The part of an operating system that turns program instructions into commands the hardware understands. |
|
|
Term
Linkage disequilibrium (D) |
|
Definition
The non-random association of alleles at two or more chromosomal sites within a population. Alleles that are very close together on a chromosome tend to be inherited together. Calculated using haplotype frequencies. D is not meaningful by itself, and is often expressed relative to the maximum possible D given haplotype frequencies. Haplotype frequencies can never be less than zero, so D can never be greater than ($F_A \times F_b$) or ($F_a \times F_B$), so the lesser of these values will give you the maximum value of D.
$F_{AB} = F_A F_B + D$
$F_{Ab} = F_A F_b - D$
$F_{aB} = F_a F_B - D$
$F_{ab} = F_a F_b + D$ |
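The calculation can be sketched in Python, taking D as the observed AB haplotype frequency minus the product of the A and B allele frequencies. The function name and the haplotype frequencies are hypothetical, and the bound used when D is negative is the standard one rather than something stated in the card.

```python
def linkage_disequilibrium(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, D') from four haplotype frequencies that sum to 1."""
    f_A = f_AB + f_Ab  # allele frequencies come from summing haplotypes
    f_a = f_aB + f_ab
    f_B = f_AB + f_aB
    f_b = f_Ab + f_ab
    d = f_AB - f_A * f_B
    if d >= 0:
        d_max = min(f_A * f_b, f_a * f_B)
    else:
        # Assumption beyond the card: the usual bound when D is negative.
        d_max = min(f_A * f_B, f_a * f_b)
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


print(linkage_disequilibrium(0.4, 0.1, 0.1, 0.4))
```

Dividing D by its maximum gives a value that can be compared across pairs of sites with different allele frequencies.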
|
|
Term
Major histocompatibility complex (MHC) |
|
Definition
Has strong geographical variation, mostly in the NW/SE axis. |
|
|
Term
|
Definition
A way to display tests from multiple sites. The x axis gives the position of each tested SNP. The y axis gives the probability of obtaining a test statistic given that the null hypothesis (SNP and trait are not associated) is true. It gets its name because the graph looks like the tall buildings of a cityscape. |
|
|
Term
Massively parallel sequencing |
|
Definition
A lot of molecules are sequenced. A chip has a collection of DNA sequences, each with 150 bp in paired ends. |
|
|
Term
|
Definition
The average number of reads per each single nucleotide. In the GIT 264-1 sequencing, the mean base coverage was 40.1. |
|
|
Term
|
Definition
Diseases caused by Mendelian genes. Relatively rare; there are 2,600. Approximately 85% affect protein coding regions or mRNA splice sites. |
|
|
Term
|
Definition
Mendelian genes
Have a large effect on phenotype. Includes the gene controlling wrinkled vs. smooth peas in Mendel's experiments. Not typically the situation; it is rare. |
|
|
Term
|
Definition
A use of sequencing technology. Gives information on gene function. DNA is treated with a chemical which digests methylated DNA, but not unmethylated DNA. Tells you what sequences are methylated and which aren't. |
|
|
Term
|
Definition
A type of mutation. In the GIT 264-1 sequencing, there were 5,091, and 357 of these were novel. These are the most important types of mutations. |
|
|
Term
|
Definition
|
|
Term
|
Definition
When multiple samples are sequenced in the machine at once. Often done to reduce cost and data size. |
|
|
Term
National Center for Biotechnology Information (NCBI) |
|
Definition
www.ncbi.nlm.nih.gov
A main source for molecular information. Has many databases that contain nucleotide/protein information and relevant literature. One main database is "nucleotide", a collection of sequences from Genbank. It is a little confusing compared to other databases. A record consists of a feature table, summary, and sequence. |
|
|
Term
Next-generation sequencing |
|
Definition
DNA is hybridized onto a pre-manufactured chip. Each "read" is 75 bp, but lots of reads are made, some of which overlap with each other in the DNA sequence. |
|
|
Term
|
Definition
A mutation which changes amino acid sequence. There were 5,091 in the GIT 264-1 sequencing. |
|
|
Term
|
Definition
Has 20 - 40 thousand genes. For a diploid organism, there is twice this amount in a somatic cell. |
|
|
Term
|
Definition
The further the value is from 1, the greater the association of allele differences to disease status. The value may be high, but the chances of being a case could still be low.
Odds ratio = [probability of being a case, given a genotype] / [probability of being a case given another genotype] |
|
|
Term
|
Definition
A platform that consists of a specific set of libraries and infrastructure for applications to be built upon and interact with each other. A software package that provides a desktop, shortcuts to applications, a web browser, and a media player. Includes Microsoft Windows, Mac OSX, and Ubuntu. |
|
|
Term
|
Definition
A book by Charles Darwin. It first talks about how people can change traits in animals through breeding, and then goes on to conclude that selection must occur in nature as well. Uses Darwin's finches as an example of this. |
|
|
Term
|
Definition
Given two independent traits, the probability of getting results like the results observed, or more extreme, given the null hypothesis is true (no association between alleles). It is the area under the curve on a graph of frequency and $\chi^2$ value. A measure of the probability of your data. |
|
|
Term
|
Definition
A directory that contains another directory. |
|
|
Term
|
Definition
Genotypes which are identical to one of the parents. If there is an association between the two genes, they will be frequent. |
|
|
Term
|
Definition
A programming language often used in bioinformatics to manipulate text based genetic data. |
|
|
Term
|
Definition
Physical locations of SNPs. You can get ever more precise locations, mapping SNPs onto the genome. |
|
|
Term
|
Definition
|
|
Term
|
Definition
There is no ethical concern involved with collecting DNA, unlike in humans. |
|
|
Term
|
Definition
Have dozens of genes. Mitochondria in plants have 100 - 1,000 kb. Mitochondria in animals have 16 kb. Chloroplasts have around 120 - 160 kb. |
|
|
Term
|
Definition
The number of sets of chromosomes in a cell. Humans have two sets. |
|
|
Term
|
Definition
Alleles are not equally distributed across different subpopulations. Can arise from non-random mating or natural selection. |
|
|
Term
|
Definition
A type of mutation where a stop codon is inserted, truncating proteins. In the GIT 264-1 sequencing, there were 33. |
|
|
Term
|
Definition
Encoded in ASCII symbols in line 4 of Illumina FASTQ outputs. It is based on the probability that a base call is wrong.
$Q = -10 \log_{10}(P)$, where $P$ is the probability that the base pair is wrong |
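The relationship runs in both directions, as this short Python sketch shows. The function names are illustrative.

```python
import math


def phred_quality(error_probability):
    """Q score for a given probability that the base call is wrong."""
    return -10 * math.log10(error_probability)


def error_probability(q_score):
    """Probability that the base call is wrong for a given Q score."""
    return 10 ** (-q_score / 10)


print(phred_quality(0.001))   # a 1 in 1,000 chance of error
print(error_probability(20))  # the error rate behind Q20
```

Each step of 10 in Q corresponds to a tenfold drop in the chance that the base is wrong.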
|
|
Term
|
Definition
Q-Q plot
Plots the theoretical or expected value of a statistic given the null hypothesis on the x axis. Plotted from small to large, with corresponding observed values on the y axis. |
|
|
Term
|
Definition
Quantitative genes
Have a small effect on phenotype. Many genes contribute to production of phenotype. More complex than Mendelian loci. They are the typical situation. |
|
|
Term
|
Definition
Measures association of alleles. It is calculated similarly to D. |
|
|
Term
|
Definition
Genotypes which are recombinant versions of the parents' genotypes due to gene crossovers. If there is an association between the two genes, they will be rare. It is possible for there to be two crossover events, so a recombinant type resembles a parental type, but this is rare. |
|
|
Term
|
Definition
A path that starts from the current directory. Does not start with a /. |
|
|
Term
|
Definition
A ratio of estimated probabilities. The value can be high but have little real-world relevance.
Relative risk = [probability, given A] / [probability, given a] |
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
|
|
Term
|
Definition
The allele which is more common in cases than in controls. |
|
|
Term
|
Definition
First directory
Denoted by a / character alone. |
|
|
Term
|
Definition
A yeast. Its genome is 12 Mbp. |
|
|
Term
|
Definition
The output of alignment software for sequence alignment. Each aligned sequence gets a single line of output. Information located on the line:
1. Name of read.
2. Sum of all applicable flags.
3. Name of sequence where alignment occurs, indicating chromosome.
4. 1-based offset into the forward reference strand, where the leftmost character of alignment occurs.
5. Mapping quality.
6. CIGAR string.
7. Name of reference sequence where mate's alignment occurs.
8. 1-based offset into the forward reference strand where the leftmost character of the mate's alignment occurs. Equal to 0 if there is no mate.
9. Inferred insert size. Size is negative if the mate's alignment occurs upstream of this alignment. Size is 0 if there is no mate.
10. Read sequence, reverse complemented if aligned to the reverse strand.
11. ASCII-encoded read quality. Encoded using Phred quality scale, offset by 33, similar to in FASTQ files.
12. Optional fields, often tab-separated. Includes edit distance in Bowtie. |
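The field list above maps onto a simple tab split. In this Python sketch the column names (QNAME, FLAG, and so on) are the ones used by the SAM specification rather than by this card, and the example line is invented.

```python
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INTEGER_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line):
    """Split one alignment line into its 11 required fields plus optional fields."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) < len(SAM_FIELDS):
        raise ValueError("expected at least 11 tab-separated fields")
    record = {}
    for name, value in zip(SAM_FIELDS, columns):
        record[name] = int(value) if name in INTEGER_FIELDS else value
    record["OPTIONAL"] = columns[len(SAM_FIELDS):]  # e.g. edit distance tags
    return record


# Invented alignment line for illustration only.
example = "read1\t99\tchr1\t100\t42\t4M\t=\t300\t204\tACGT\tIIII\tNM:i:0"
print(parse_sam_line(example))
```

Header lines, which begin with an '@' character in real files, would need to be skipped before calling this function.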
|
|
Term
|
Definition
A variable used to hold a single value such as a string, an integer, or a floating point (number with a decimal). You can use them as the number or string. Denoted with the $ sigil. |
|
|
Term
|
Definition
The lifespan and visibility of named entities, most often variables. Created by blocks. Most common scope is lexical. |
|
|
Term
|
Definition
Percentage of known SNPs that are captured by DNA sequencing. In the GIT 264-1 sequencing, sensitivity was 96.3%. |
|
|
Term
|
Definition
Can be used for transcriptome sequencing (mRNAs), genome sequencing, genome subset sequencing (exomes, certain restriction sites), DNA-protein interactions (ChIP sequencing), methylation sequencing, and small RNA sequencing. The smallest sequencer generates 400 million paired reads from 400 million molecules, with each read being 150 bp, or 300 bp for a paired end: this is equal to 120 Gbp. The largest sequencer generates 1,800 Gbp. |
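The 120 Gbp figure follows from multiplying molecules by bases read per molecule, which this small Python sketch reproduces. The function name is made up.

```python
def run_yield_bp(molecules, read_length_bp, paired=True):
    """Total bases produced when each molecule is read once or from both ends."""
    reads_per_molecule = 2 if paired else 1
    return molecules * read_length_bp * reads_per_molecule


total = run_yield_bp(400_000_000, 150)
print(total / 1e9, "Gbp")
```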
|
|
Term
Single nucleotide polymorphism (SNP) |
|
Definition
A site in the DNA which differs between two individuals at one nucleotide. It is the primary tool used for DNA investigation. |
|
|
Term
|
Definition
A highly conserved gene that encodes an epithelial Cl-/HCO3- exchanger. Mutation D652N changes aspartic acid to asparagine, causing congenital chloride-losing diarrhea. Case study GIT 264-1 had a homozygous variant in this gene. Screening 39 other patients with suspected Bartter syndrome, it was found that 5 had homozygous mutations in this gene: 3 with watery diarrhea and no renal losses, and 2 with high stool chloride levels. |
|
|
Term
|
Definition
A very short nucleotide sequence that can target and regulate sequences of DNA. |
|
|
Term
|
Definition
Identifies SNPs and indels from .SAM outputs. Includes mpileup from SAMTOOLS, and HaplotypeCaller from GATK. Output is variant call format. |
|
|
Term
|
Definition
Percentage of SNPs that are sequenced correctly. A reference genome is used as a cross-reference. In the GIT 264-1 sequencing, specificity was 98.6%. |
|
|
Term
|
Definition
A type of mutation. In the GIT 264-1 sequencing, there were 84. |
|
|
Term
|
Definition
https://github.com/alexdobin/STAR
An alignment software. |
|
|
Term
|
Definition
A series of characters, including numbers, surrounded by either single or double quotes. Single quoted strings are taken as is, with contents not interpolated. Double quoted strings are interpolated. Interpolation involves identification and replacement of special characters with contents by the Perl interpreter. |
|
|
Term
|
Definition
"Wobble"
A type of mutation where there is no change in amino acid sequence. In the GIT 264-1 sequencing, there were 6,462, and 253 of these were novel. |
|
|
Term
|
Definition
Caused by genetic variation, epigenetics, and the environment. In this course, the focus is on genetic variation. It is important to refer to a specific attribute and population, because different populations may have different variation. |
|
|
Term
Transcription factor binding (TFB) site |
|
Definition
An upstream enhancer site with a high conservation score. A motif that occurs before gene sequences. Can change or eliminate transcription binding when it is mutated. |
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
A common disease with environmental and genetic factors that contribute to it. |
|
|
Term
|
Definition
A kernel and an operating system. A platform of file management infrastructure and software libraries upon which applications can be built and interact. There are several standards that are variously built. |
|
|
Term
Variant call format (.vcf) |
|
Definition
Output files of SNP detection software. A tab-delimited format with each data line consisting of the following fields:
1. CHROM: Chromosome name.
2. POS: The left-most position of the variant.
3. ID: Unique variant identifier.
4. REF: The reference allele.
5. ALT: The alternate allele(s), comma separated.
6. QUAL: Variant/reference quality.
7. FILTER: Filters applied.
8. INFO: Information of variant, semicolon separated.
9. FORMAT: Format of genotype fields. Optionally colon separated.
10. SAMPLE: Sample genotypes and per-sample information. |
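The separators named in the list (tabs between fields, commas in ALT, semicolons in INFO, colons in FORMAT and the sample columns) are enough to parse a data line. This Python sketch uses an invented line and a made-up function name, and it ignores the header lines that start with '#' in real files.

```python
VCF_FIXED = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Split one VCF data line into a dictionary of its fields."""
    columns = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_FIXED, columns))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")  # comma-separated alternate alleles
    info = {}
    for entry in record["INFO"].split(";"):   # semicolon-separated key=value pairs
        key, _, value = entry.partition("=")
        info[key] = value if value else True   # entries without '=' act as flags
    record["INFO"] = info
    keys = columns[8].split(":") if len(columns) > 8 else []
    record["SAMPLES"] = [dict(zip(keys, sample.split(":")))
                         for sample in columns[9:]]
    return record


# Invented data line for illustration only.
example = "chr1\t100\t.\tA\tG,T\t50\tPASS\tDP=40;DB\tGT:DP\t0/1:40"
print(parse_vcf_line(example))
```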
|
|
Term
|
Definition
Differences between things. |
|
|
Term
|
Definition
Its genome is 15,000 Mbp. |
|
|
Term
|
Definition
Runs a block of code multiple times on different pieces of data. Evaluates a condition and runs a block if the condition is true, like an if statement, then repeats until the condition becomes false.
while (condition) { do this } |
|
|
Term
|
Definition
Popular among consumers. The skin colour is a result of accumulation of carotenoid, which is a sign of good health in natural populations. The gene was found to be on chromosome 24, and chromosomal sites close to the gene were found using results from a backcross of F1 to the yellow parent; SNP 1 was found to be 20 cM away from the gene, and SNP 2 was 9 cM away from the gene. In close association with a SNP in APOA1. Allelic variation at any gene upstream or downstream of APOA1 could cause the trait difference. A SNP in the BCDO2 gene is found in chicken breeds with yellow skin, and not in breeds with white skin. The dominant allele is the white allele, which is opposite to what one might expect. It could have one or more nucleotides that change an amino acid important for enzyme function or a sequence important for transcriptional regulation. |
|
|