Term
|
Definition
Any (possibly multi-step) action that reads from and/or writes to a database. |
|
|
Term
|
Definition
One in which all of the SQL statements are completed successfully. |
|
|
Term
Consistent Database State |
|
Definition
One in which all data integrity constraints are satisfied. |
|
|
Term
Properties of a transaction |
|
Definition
Atomicity, Consistency, Isolation, Durability (ACID) |
|
|
Term
Atomicity |
Definition
All operations of a transaction must be completed; if any operation fails, the entire transaction is aborted. |
|
|
Term
Consistency |
Definition
When a database transaction is completed, the database must be in a consistent state. |
|
|
Term
Isolation |
Definition
Data used during the execution of a transaction cannot be used by a second transaction until the first one is completed. |
|
|
Term
Durability |
Definition
Once transaction changes are committed, they cannot be undone or lost due to a subsequent failure. |
|
|
Term
|
Definition
Permanently records all changes in the database. |
|
|
Term
|
Definition
Aborts all uncommitted changes
Database is rolled back to its previous state |
|
|
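The commit and rollback behavior described in the cards above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not part of the original card set: the `account` table, the `transfer` helper, and the "no negative balance" rule are all assumptions made for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one all-or-nothing transaction."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        # Assumed integrity rule for this sketch: no balance may go negative.
        (lowest,) = conn.execute("SELECT MIN(balance) FROM account").fetchone()
        if lowest < 0:
            raise ValueError("insufficient funds")
        conn.commit()      # permanently records both updates
        return True
    except Exception:
        conn.rollback()    # aborts both updates; previous state is restored
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # violates the rule and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer, neither of its two updates is visible, which is the atomicity property in miniature.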
Term
|
Definition
Maintained by a DBMS to support recovery to a consistent state. |
|
|
Term
|
Definition
The process of managing simultaneous operations on the database without having them interfere with one another. |
|
|
Term
|
Definition
Occurs when a successfully completed update is overwritten by another transaction. |
|
|
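The overwritten-update problem in the card above can be reproduced with a toy schedule runner. This is a hedged sketch in plain Python, not a real DBMS: `run_schedule`, the step format, and the starting quantity of 35 are hypothetical choices for illustration.

```python
def run_schedule(schedule, initial):
    """Run (transaction, operation, argument) steps against one shared value."""
    db = initial
    local = {}  # each transaction's private working copy
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = db
        elif op == "add":
            local[txn] += arg
        elif op == "write":
            db = local[txn]
    return db

# T1 adds 100 units; T2 removes 30 units.
serial = [
    ("T1", "read", None), ("T1", "add", 100), ("T1", "write", None),
    ("T2", "read", None), ("T2", "add", -30), ("T2", "write", None),
]
interleaved = [
    ("T1", "read", None), ("T2", "read", None),
    ("T1", "add", 100), ("T2", "add", -30),
    ("T1", "write", None), ("T2", "write", None),  # T2 overwrites T1's update
]

serial_result = run_schedule(serial, 35)
interleaved_result = run_schedule(interleaved, 35)
```

The interleaved run ends with a different value than the serial run because T2 wrote a result computed from a stale read, so this particular interleaving is not equivalent to a serial execution.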
Term
|
Definition
Occurs when a transaction accesses the intermediate results of another transaction before they are committed, and that other transaction is then rolled back. |
|
|
Term
|
Definition
Occurs when a transaction reads several values, but a different transaction updates some of them in the midst of this process. |
|
|
Term
|
Definition
A schedule of a transaction's operations in which the interleaved execution of all active transactions yields the same results as if those transactions were executed in serial order. |
|
|
Term
|
Definition
The size of the locked resource:
Database-level, table-level, page-level, row-level |
|
|
Term
|
Definition
Prohibits other users from reading the locked resource. |
|
|
Term
|
Definition
Allows other users to read the locked resource, but they cannot update it. |
|
|
Term
|
Definition
Assumes that no transaction conflict(s) will occur.
The DBMS processes a transaction against a temporary file, then checks whether a conflict occurred. |
|
|
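One common way to implement the "check whether a conflict occurred" step is a version number per item. The card does not say how the check is done, so the version counter, the `VersionedStore` class, and its method names below are assumptions for this sketch.

```python
class VersionedStore:
    """Toy key-value store with a version number per key (hypothetical)."""

    def __init__(self):
        self.data = {}  # key -> (value, version)

    def read(self, key):
        """Return (value, version); the caller remembers the version it saw."""
        return self.data.get(key, (None, 0))

    def commit(self, key, new_value, version_read):
        """Validation step: apply the write only if the key is unchanged since it was read."""
        _, current = self.data.get(key, (None, 0))
        if current != version_read:
            return False  # conflict detected; the caller should retry
        self.data[key] = (new_value, current + 1)
        return True

store = VersionedStore()
store.commit("stock", 10, 0)

v1, ver1 = store.read("stock")  # T1 reads
v2, ver2 = store.read("stock")  # T2 reads the same version
t1_ok = store.commit("stock", v1 - 3, ver1)  # T1 validates first and wins
t2_ok = store.commit("stock", v2 - 5, ver2)  # T2 sees a newer version and is rejected
final_value, final_version = store.read("stock")
```

No locks are taken while the transactions work; the cost of a conflict is paid only at commit time, by the transaction that validates second.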
Term
|
Definition
Assumes conflict(s) will occur:
Locks are issued before a transaction is processed and are released afterward. |
|
|
Term
Two-Phase Locking |
Definition
Guarantees serializability
One of the most common techniques used to achieve this
Transactions are allowed to obtain as many locks as necessary (growing phase)
Once the first lock is released (shrinking phase), no additional locks can be obtained
Two-phase locking doesn't prevent deadlocks |
|
|
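The growing and shrinking phases can be expressed as a small guard on a transaction object. The sketch below only checks that one transaction follows the protocol; it does not model lock conflicts between transactions, and the `TwoPhaseTransaction` class is a hypothetical name.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True after the first unlock

    def lock(self, resource):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {resource!r} in the shrinking phase")
        self.held.add(resource)

    def unlock(self, resource):
        self.held.remove(resource)
        self.shrinking = True  # the growing phase is over for good

t = TwoPhaseTransaction("T1")
t.lock("A")
t.lock("B")    # growing phase: any number of locks
t.unlock("A")  # first release starts the shrinking phase

try:
    t.lock("C")
    violated = False
except RuntimeError:
    violated = True
```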
Term
|
Definition
An impasse that may result when two (or more) transactions are waiting for locks held by the other to be released. |
|
|
Term
|
Definition
Abort a transaction if there is a possibility of deadlock
Reschedule transaction for later execution |
|
|
Term
|
Definition
DBMS periodically tests database for deadlocks
If found, one transaction ("victim") is rolled back |
|
|
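Periodic deadlock testing is often described in terms of a wait-for graph, where a cycle means deadlock. The card does not name a technique, so the graph representation, the `find_cycle` function, and the victim rule below are assumptions for illustration.

```python
def find_cycle(wait_for):
    """Return a list of transactions that form a cycle in the wait-for graph, or None."""
    visiting, done = set(), set()
    path = []

    def dfs(txn):
        visiting.add(txn)
        path.append(txn)
        for blocker in wait_for.get(txn, ()):
            if blocker in visiting:
                return path[path.index(blocker):]  # back edge closes a cycle
            if blocker not in done:
                cycle = dfs(blocker)
                if cycle:
                    return cycle
        visiting.discard(txn)
        done.add(txn)
        path.pop()
        return None

    for txn in wait_for:
        if txn not in done:
            cycle = dfs(txn)
            if cycle:
                return cycle
    return None

# "T1": ["T2"] means T1 is waiting for a lock held by T2.
wait_for = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": ["T1"]}
cycle = find_cycle(wait_for)

# Assumed victim policy for this sketch: the highest-named transaction in the cycle.
victim = max(cycle) if cycle else None

no_deadlock = find_cycle({"T1": ["T2"], "T2": []})
```

Rolling back the victim releases its locks, which breaks the cycle and lets the remaining transactions proceed.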
Term
|
Definition
Transactions obtain all needed locks before execution |
|
|
Term
|
Definition
A unique identifier created by DBMS that indicates the relative starting time of a transaction. |
|
|
Term
|
Definition
A subroutine available to applications accessing a relational database system. A stored procedure (sproc or SP) is actually stored in the database. |
|
|
Term
Stored Procedure can input and return? |
|
Definition
|
|
Term
Stored Procedures can be called from? |
|
Definition
Standard languages (Java, C#)
Scripting languages (JavaScript, VBScript, PHP)
SQL Command prompt (SQL*Plus) |
|
|
Term
Advantages of Stored Procedures |
|
Definition
Performance
Compiled Once
Server Side computation
Executable code is cached and shared
Grouping SQL statements allows for single-call execution |
|
|
Term
Persistent Stored Modules |
|
Definition
SQL itself does not support control statements such as looping operations
The SQL-99 standard defines persistent stored modules to add this procedural capability. |
|
|
Term
|
Definition
A procedure that is automatically executed by the RDBMS when a given data manipulation event occurs.
Often used to enforce referential integrity. |
|
|
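A trigger that enforces a referential rule can be tried out in SQLite through Python's `sqlite3` module. This is a sketch under assumptions: the `product` and `order_line` tables and the trigger name are invented for the example, and the syntax shown is SQLite's, which differs in detail from other RDBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE order_line (product_id INTEGER, qty INTEGER)")

# Runs automatically before every INSERT on order_line and rejects
# rows that point at a product that does not exist.
conn.execute("""
    CREATE TRIGGER order_line_product_check
    BEFORE INSERT ON order_line
    WHEN NOT EXISTS (SELECT 1 FROM product WHERE id = NEW.product_id)
    BEGIN
        SELECT RAISE(ABORT, 'order_line references an unknown product');
    END
""")

conn.execute("INSERT INTO product VALUES (1, 'widget')")
conn.execute("INSERT INTO order_line VALUES (1, 5)")  # valid reference

try:
    conn.execute("INSERT INTO order_line VALUES (99, 1)")  # no product 99
    rejected = False
except sqlite3.Error:
    rejected = True

(line_count,) = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()
```

The application code never calls the trigger; the data manipulation event itself causes it to run.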
Term
Business Intelligence (BI) |
|
Definition
A set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information. |
|
|
Term
Who coined the term Business Intelligence and when? |
|
Definition
Gartner Group -> Early 1990s |
|
|
Term
Typical Major Components of Business Intelligence |
|
Definition
Data extraction, transformation, and loading tools
Data store (data warehouse or Data mart)
Data Query and analysis tool (OLAP)
Data presentation and visualization tools (dashboard) |
|
|
Term
Major Software Vendors of BI |
|
Definition
Microsoft, IBM, Oracle, SAP |
|
|
Term
|
Definition
(Transactional databases)
Stored in highly normalized tables in a relational database
Dynamically updated
Focus on traditional information systems |
|
|
Term
|
Definition
(Data Warehouses)
Stored in formats that facilitate data extraction, data analysis, and decision making
Often aggregated
Often with redundancies. |
|
|
Term
|
Definition
1. Collecting and storing operational data
2. Aggregating the operational data into decision support data
3. Analyzing the decision support data to generate information
4. Presenting the information to the end user to support decision making
5. Making business decisions (and generating more data)
6. Monitoring results to evaluate outcomes of the business decisions |
|
|
Term
|
Definition
A database optimized for data analysis and read-only query processing. |
|
|
Term
Master Data Management (MDM) |
|
Definition
Provides for a comprehensive and consistent definition of all data in an organization
Ensures uniform and consistent views of all data
Supports proper Governance
For controlling and monitoring business health
Creates accountability
|
|
|
Term
Data Warehouse Characteristics |
|
Definition
Integrated - Consistent format and meaning.
Subject-oriented - Organized to answer questions.
Time-variant - Captures and represents the flow of data over time.
Nonvolatile - Once the data enters the warehouse, it's never removed.
~1 to 3 years to implement |
|
|
Term
|
Definition
A small, single-subject data warehouse subset that provides decision support to a small group of people. |
|
|
Term
Data Mart Characteristics |
|
Definition
Less organizational commitment
Lower Cost
Shorter implementation time
~6 months - 1 year |
|
|
Term
Online Analytical Processing (OLAP) |
|
Definition
Graphical User Interface
Analytic processing logic
Data-processing logic
Capacity for multi-dimensional analysis
Used with both transactional databases and data warehouses
|
|
|
Term
|
Definition
Data is stored in multi-dimensional arrays
Typically visualized as being stored as a Data Cube |
|
|
Term
|
Definition
Data Retrieval is much quicker than with standard relational databases
Provides opportunity to "Slice and dice" data
Foundation for multi-dimensional OLAP |
|
|
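"Slice and dice" can be illustrated with a tiny cube held in a Python dictionary. This is a toy model, not how an OLAP engine stores data: the three dimensions, the sales figures, and the `slice_cube` and `roll_up` helpers are all made up for the sketch.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "quarter")

# Each cell of the cube: (product, region, quarter) -> units sold.
cube = {
    ("widget", "east", "Q1"): 100,
    ("widget", "west", "Q1"): 80,
    ("gadget", "east", "Q1"): 50,
    ("widget", "east", "Q2"): 120,
    ("gadget", "west", "Q2"): 70,
}

def slice_cube(cube, **fixed):
    """Keep only the cells whose coordinates match the fixed dimension values."""
    wanted = {DIMENSIONS.index(dim): value for dim, value in fixed.items()}
    return {
        key: units
        for key, units in cube.items()
        if all(key[i] == value for i, value in wanted.items())
    }

def roll_up(cube, dimension):
    """Total the measure along one dimension, collapsing the others."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[i]] += units
    return dict(totals)

q1 = slice_cube(cube, quarter="Q1")                            # slice: fix one dimension
east_widgets = slice_cube(cube, product="widget", region="east")  # dice: fix several
q1_by_region = roll_up(q1, "region")
by_product = roll_up(cube, "product")
```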
Term
|
Definition
Associated with a particular type of (aggregated) data.
Ex. Sales Table |
|
|
Term
|
Definition
Attributes provide descriptive information about the facts within a given dimension.
Ex. Product, time, and location tables. |
|
|
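The relationship between a fact table and a dimension table can be sketched as a join followed by aggregation. The `sales_fact` and `product_dim` tables and their columns below are hypothetical, chosen to echo the sales and product examples on the cards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: descriptive attributes about each product.
    CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, category TEXT);
    -- Fact: numeric measures, keyed by the dimension.
    CREATE TABLE sales_fact (product_id INTEGER, units INTEGER, revenue REAL);

    INSERT INTO product_dim VALUES (1, 'toys'), (2, 'toys'), (3, 'books');
    INSERT INTO sales_fact VALUES (1, 2, 20.0), (2, 1, 15.0), (3, 4, 40.0), (1, 3, 30.0);
""")

# Facts are aggregated; the dimension supplies the grouping attribute.
rows = conn.execute("""
    SELECT d.category, SUM(f.units), SUM(f.revenue)
    FROM sales_fact AS f
    JOIN product_dim AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
```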
Term
|
Definition
Non-trivial extraction of implicit, previously unknown, and potentially useful information from data. |
|
|
Term
Motivation of Data Mining |
|
Definition
Ideas come from many disciplines, including machine learning/AI, pattern recognition, statistics, and database systems |
|
|
Term
Supervised algorithms (Classification) |
|
Definition
Learning by example
Use of training data which has correct answers
Create a model by running the algorithm on the training data |
|
|
Term
Unsupervised algorithms (Clustering) |
|
Definition
Does not use training data
Classes may not be known in advance |
|
|
Term
|
Definition
Given a collection of records
Each record contains a set of attributes, one of the attributes is the dependent variable/class
Find a model to predict the class attribute as a function of the values of the other attributes
Goal: previously unseen records should be assigned to a class as accurately as possible
|
|
|
Term
|
Definition
Basic idea:
Look at characteristics / attributes
“If it walks like a duck and quacks like a duck, then it’s probably a duck”
|
|
|
Term
Nearest-Neighbor Classifier |
|
Definition
Requires three things
-The set of stored records
-Distance Metric to compute the distance between records
-The value of k, the number of nearest neighbors to retrieve
To classify an unknown record:
-Compute distance to other training records
-Identify k nearest neighbors
-Use class labels of nearest neighbors to determine the class label of unknown record (e.g., by taking majority vote, weighted distance)
|
|
|
Term
|
Definition
If k is too small, the model is sensitive to noise
If k is too large, neighborhood may include too many points from other classes
|
|
|
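The nearest-neighbor steps and the note about choosing k can be tried with a few lines of Python. This is a minimal sketch assuming Euclidean distance and a plain majority vote; the two-feature training records and the `knn_classify` name are invented for the example.

```python
import math
from collections import Counter

def knn_classify(records, query, k):
    """Classify `query` by majority vote among its k nearest stored records.

    records: list of (feature_tuple, class_label) pairs.
    """
    # Compute the distance to every stored record and sort nearest-first.
    by_distance = sorted(records, key=lambda rec: math.dist(rec[0], query))
    # Vote using the class labels of the k nearest neighbors.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

training = [
    ((1.0, 1.0), "duck"),
    ((1.2, 0.8), "duck"),
    ((0.9, 1.1), "duck"),
    ((5.0, 5.0), "goose"),
    ((5.2, 4.9), "goose"),
]

near_ducks = knn_classify(training, (1.1, 1.0), k=3)
near_geese = knn_classify(training, (4.8, 5.1), k=3)
too_large_k = knn_classify(training, (4.8, 5.1), k=5)
```

With k=5 the neighborhood covers the whole training set, so the point sitting among the geese is outvoted by the three ducks, which is the "k is too large" failure described above.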
Term
|
Definition
An artificial neural network (ANN), usually called a neural network (NN), is a mathematical or computational model inspired by the structure and/or functional aspects of biological neural networks.
They are usually used to model complex relationships between inputs and outputs or to find patterns in data.
|
|
|
Term
|
Definition
Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that
Data points in one cluster are more similar to one another
Data points in separate clusters are less similar to one another
Similarity Measures:
Euclidean Distance (if attributes are continuous)
Other Problem-specific Measures
|
|
|
Term
|
Definition
Clustering Points: Twitter feeds / blog comments
Similarity Measure: How many words are common in these “documents” (after some word filtering)
Applications:
Identify issues with a product more quickly and with greater detail
Identify the occurrence of / details about a disaster event as it is in the process of occurring (used for flooding in Oklahoma and North Dakota)
|
|
|
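The "how many words are common" similarity measure can be sketched as a set intersection after filtering. Everything specific here is an assumption for illustration: the stop-word list, the sample posts, the threshold, and the greedy single-pass grouping, which stands in for whatever clustering algorithm a real system would use.

```python
STOP_WORDS = {"the", "a", "is", "in", "my", "and", "to", "of"}  # assumed filter list

def words(text):
    """Lowercase the text, strip basic punctuation, and drop stop words."""
    return {w.strip(".,!?").lower() for w in text.split()} - STOP_WORDS

def common_word_count(a, b):
    """Similarity measure: number of distinct words the two documents share."""
    return len(words(a) & words(b))

def cluster(docs, min_common):
    """Greedy grouping: a document joins the first cluster whose first
    member shares at least `min_common` words with it."""
    clusters = []
    for doc in docs:
        for group in clusters:
            if common_word_count(doc, group[0]) >= min_common:
                group.append(doc)
                break
        else:
            clusters.append([doc])  # no similar cluster found; start a new one
    return clusters

posts = [
    "Battery drains fast on the new phone",
    "River is flooding the main road",
    "New phone battery dies in hours",
    "Main road closed, river still flooding",
]
groups = cluster(posts, min_common=2)
```

The product complaints and the flood reports end up in separate groups even though no class labels were supplied in advance.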