Term
|
Definition
a term referring to the huge amount of data available today, which is often too big and unstructured to utilize with conventional database software |
|
|
Term
the amount of data on corporate hard drives doubles ______ |
|
Definition
the amount of data on corporate hard drives doubles every 6 months |
|
|
Term
business intelligence (BI) |
|
Definition
a term combining aspects of reporting, data exploration and ad hoc queries, and sophisticated data modeling and analysis |
|
|
Term
|
Definition
the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions |
|
|
Term
why is moving early key in establishing competitive advantage in using data to create models? |
|
Definition
there's no monopoly on math; advantages based on capabilities and data that others can acquire will be short-lived. |
|
|
Term
what is key in establishing operationally effective data to gain true strategic positioning? |
|
Definition
differentiation in distinguishing operationally effective data use is key to true strategic positioning |
|
|
Term
what is a huge limiting factor of BI? |
|
Definition
getting data into a form where it can be used, analyzed and turned into information is a limiting factor of BI |
|
|
Term
|
Definition
outdated information systems that were not designed to share data, aren't compatible with newer technologies, and aren't aligned with the firm's current business needs. |
|
|
Term
what's the issue with most transactional databases? |
|
Definition
most transactional databases aren't set up to be simultaneously accessed for reporting and analytics, so this forces the firm to export the data to a warehouse or data mart |
|
|
Term
|
Definition
a set of databases designed to support decision making in an organization. structured for fast online queries and exploration, may aggregate enormous amounts of data from many different operational systems |
|
|
Term
|
Definition
a database focused on addressing the concerns of a specific problem like increasing customer retention or improving product quality |
|
|
Term
what needs to happen before a firm tackles changing its system? |
|
Definition
a firm needs to have its business goals clearly defined before it can begin to design, develop, deploy, and maintain its system. |
|
|
Term
what questions should be asked when planning to change a system? |
|
Definition
after establishing a clear goal, a business needs to address questions concerning data relevance, sourcing, quantity, quality, hosting, and governance. |
|
|
Term
|
Definition
an open source project that was created to analyze massive amounts of raw information better than traditional, highly structured databases. it consists of some half dozen separate software pieces. |
|
|
Term
four primary advantages to Hadoop: |
|
Definition
1. flexibility: can absorb any type of data from any source 2. scalability: can start with just one machine, but allows for others to join and combine to work together for storage and analysis 3. cost effectiveness: open source 4. fault tolerance: no single point of failure |
|
|
Term
|
Definition
identifying and retrieving relevant electronic info to support litigation efforts, something a firm should account for in archiving and data storage plans. |
|
|
Term
main problem with creating large data warehouses? |
|
Definition
large data warehouses are complex, costly, and can take years to build. |
|
|
Term
query and reporting tools |
|
Definition
designed to present users with a subset of requested data that has been selected, sorted, ordered, calculated, and compared as needed. these tools help managers to see what's happening inside their organizations |
|
|
Term
|
Definition
provide regular summaries of information in a predetermined format. often developed by IS staff and can be difficult to alter |
|
|
Term
|
Definition
tools that put users in control so they can create custom reports on an as-needed basis |
|
|
Term
|
Definition
a heads-up display of critical indicators, letting managers get a graphical glance at key performance metrics. |
|
|
Term
online analytical processing (OLAP) |
|
Definition
a method of querying and reporting that takes data from standard relational databases, calculates and summarizes it, and then stores it in a data cube. this makes it extremely fast and allows users to 'slice and dice' their data, exploring and comparing it across multiple factors to uncover new insights |
|
|
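As a rough illustration of the 'slice and dice' idea, here is a minimal sketch in Python. It is not how any particular OLAP product works: the sales rows, the three dimensions (region, product, quarter), and the function names `build_cube` and `slice_cube` are all hypothetical. The point is only that totals are pre-aggregated once, so later questions are answered from the summarized cells rather than the raw rows.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, quarter, revenue)
SALES = [
    ("east", "widgets", "Q1", 100),
    ("east", "gadgets", "Q1", 150),
    ("west", "widgets", "Q1", 200),
    ("east", "widgets", "Q2", 120),
    ("west", "gadgets", "Q2", 80),
]


def build_cube(rows):
    """Pre-aggregate revenue for every (region, product, quarter) cell."""
    cube = defaultdict(int)
    for region, product, quarter, revenue in rows:
        cube[(region, product, quarter)] += revenue
    return dict(cube)


def slice_cube(cube, region=None, product=None, quarter=None):
    """Sum the cells that match the fixed dimensions; None means 'all values'."""
    wanted = (region, product, quarter)
    return sum(
        value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


cube = build_cube(SALES)
east_total = slice_cube(cube, region="east")
widgets_q1 = slice_cube(cube, product="widgets", quarter="Q1")
print(east_total, widgets_q1)
```

Fixing one dimension gives a "slice"; fixing several narrows the view further, which is the comparison across multiple factors the card describes.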
Term
|
Definition
a special database used in OLAP |
|
|
Term
|
Definition
non-trivial discovery of novel, valid, comprehensible and potentially useful patterns from data. the process of using computers to identify hidden patterns and to build models from large data sets. |
|
|
Term
key areas where businesses are leveraging data mining: |
|
Definition
1. customer segmentation 2. marketing and promotion targeting 3. market basket analysis 4. collaborative filtering 5. customer churn 6. fraud detection 7. financial modeling 8. hiring and promotion |
|
|
Term
|
Definition
determining which customers are likely to leave and what tactics can help the firm avoid this |
|
|
Term
what two conditions need to be present in order for data mining to work? |
|
Definition
1. the organization must have clean, consistent data 2. the events in that data should reflect current and future trends |
|
|
Term
|
Definition
deceiving your system leads to bad data, and bad data creates bad models, which lead to bad estimates |
|
|
Term
problem of historical consistency: |
|
Definition
computer models are blind when faced with black swans |
|
|
Term
|
Definition
creating a model with so many variables that 1. the solution arrived at might only work on the subset of data you've created 2. you might be looking at a meaningless statistical fluke |
|
|
Term
how do you test to see if you're looking at a random occurrence? |
|
Definition
to test if you're looking at a random occurrence, divide your data. use one portion to build your model and the other to verify your results. |
|
|
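The divide-and-verify idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the data set (where y is exactly twice x), the 70/30 split, the fixed seed, and the names `split_data`, `predict`, and `mean_absolute_error` are all hypothetical.

```python
import random


def split_data(rows, build_fraction=0.7, seed=42):
    """Shuffle a copy of the rows, then split into build and verify portions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * build_fraction)
    return shuffled[:cut], shuffled[cut:]


def mean_absolute_error(predict, rows):
    """Average size of the prediction error over the given rows."""
    return sum(abs(predict(x) - y) for x, y in rows) / len(rows)


# Hypothetical data set in which y is always exactly 2 * x.
data = [(x, 2 * x) for x in range(20)]
build, verify = split_data(data)

# A deliberately simple "model": the average y/x ratio, learned from the
# build portion only (rows with x == 0 are skipped to avoid dividing by zero).
pairs = [(x, y) for x, y in build if x != 0]
slope = sum(y / x for x, y in pairs) / len(pairs)


def predict(x):
    return slope * x


build_error = mean_absolute_error(predict, build)
verify_error = mean_absolute_error(predict, verify)
print(build_error, verify_error)
```

If the error on the verify portion is much worse than on the build portion, the pattern found may be a fluke of the data used to build the model.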
Term
three critical skills needed by an effective data mining and business analytics team |
|
Definition
1. information technology- understanding how to pull data together 2. statistics- to build models and interpret strength and validity of results 3. business knowledge- to help set goals, requirements, offer deeper insight into what the data is really saying about the business environment |
|
|
Term
|
Definition
an AI network that hunts down and exposes patterns and builds models to exploit findings |
|
|
Term
|
Definition
AI systems that leverage experts' knowledge to create if/then rules in order to perform a task in a way that mimics applied human expertise. they improve decision making in non-experts |
|
|
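A minimal sketch of the if/then idea, in Python. The loan-screening rules, the field names (`late_payments`, `income`, `loan_payment`), and the thresholds are invented purely for illustration and do not come from any real system.

```python
# Each rule pairs an "if" condition with a "then" recommendation.
# Rules are checked in order, so earlier rules take priority.
RULES = [
    (lambda case: case["late_payments"] > 3, "decline"),
    (lambda case: case["income"] >= 3 * case["loan_payment"], "approve"),
]


def recommend(case, rules=RULES, default="refer to human expert"):
    """Return the recommendation of the first rule whose condition holds."""
    for condition, conclusion in rules:
        if condition(case):
            return conclusion
    return default


applicant = {"late_payments": 0, "income": 3000, "loan_payment": 1000}
print(recommend(applicant))
```

The rules capture what an expert would check, so a non-expert running the same cases through them gets the expert's recommendation.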
Term
|
Definition
model building techniques where computers examine many potential solutions to a problem, modifying various models and comparing the models to look for the best alternative. |
|
|
Term
|
Definition
Walmart's proprietary system that records sales and automatically triggers inventory reordering, scheduling, and delivery. main reason for Walmart's incredible inventory turnover rate of 8.5 (selling its entire inventory roughly every 6 weeks) |
|
|
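The "every 6 weeks" figure follows from the turnover rate if 8.5 is read as turns per year (an assumption; the card does not state the period). A quick check in Python, with a hypothetical helper name:

```python
WEEKS_PER_YEAR = 52


def weeks_to_sell_inventory(turns_per_year):
    """Weeks needed to sell through the inventory once, given annual turns."""
    return WEEKS_PER_YEAR / turns_per_year


weeks = weeks_to_sell_inventory(8.5)
print(round(weeks, 1))
```

52 / 8.5 is a little over 6, which matches the card's rough figure.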
Term
|
Definition
too little or too much inventory |
|
|
Term
walmart uses data mining to: |
|
Definition
1. keep product mix right in varying conditions (pop tarts during hurricanes) 2. make operational forecasts (how many cashiers are needed) |
|
|
Term
how does walmart use Hadoop? |
|
Definition
walmart leverages its Hadoop-based data to support its social media data mining efforts |
|
|
Term
who does walmart share their data with? |
|
Definition
walmart gives suppliers access to their products' walmart performance across metrics |
|
|
Term
how does walmart keep data competitors off their trail? |
|
Definition
walmart custom builds large portions of its information systems and closely guards its infrastructure |
|
|
Term
what is a current challenge for walmart? |
|
Definition
walmart is reaching a plateau; it needs to find huge markets or dramatic cost savings in order to boost profits and continue to move its stock price higher |
|
|
Term
|
Definition
too aggressive and big: 1. subpar wages 2. poor labor conditions at some of their suppliers 3. catch-22 for suppliers: miss out on retail sales, or accept such low prices that they end up cannibalizing their own sales with other retailers 4. threatening mom-and-pop stores |
|
|
Term
problems with operational data |
|
Definition
1. data is in too many places 2. data is dirty/missing values 3. data is not maintained consistently 4. data is hard to retrieve from legacy systems 5. too much data |
|
|
Term
|
Definition
integrate data from multiple sources and process it. the results are then formatted into reports used to improve decision making |
|
|
Term
|
Definition
data from operational databases, internal/external sources go to a data extraction/preparation program, and then to a warehouse |
|
|
Term
|
Definition
reporting systems, data-mining systems, knowledge management systems, expert systems |
|
|
Term
business intelligence systems use __________ to provide reporting and analysis for organizational _______ |
|
Definition
business intelligence systems use data created by other systems to provide reporting and analysis for organizational decision making |
|
|
Term
how do data mining systems work? |
|
Definition
data mining systems use statistical techniques such as regression and decision tree analysis to look for patterns and relationships in order to predict outcomes. |
|
|
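As one small example of the regression technique mentioned above, here is an ordinary least-squares line fit in plain Python. The ad-spend and sales figures are made up for illustration, and `fit_line` is a hypothetical helper, not a function from any library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales rise by 2 for each extra unit of ad spend.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

slope, intercept = fit_line(ad_spend, sales)
forecast = intercept + slope * 6  # predicted sales at an unseen spend level
print(slope, intercept, forecast)
```

The fitted relationship found in past data is then used to predict an outcome for a new input, which is the pattern-then-predict step the card describes.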
Term
|
Definition
dads who go shopping thursday through saturday will stock up on diapers and beer |
|
|
Term
knowledge management systems |
|
Definition
used to share human knowledge and thus gain value from intellectual capital. used to foster innovation and increase company organizational responsiveness |
|
|
Term
|
Definition
how recently a customer ordered, how frequently they order, and how much they spend per order. |
|
|
Term
|
Definition
customers sorted by date of most recent purchase and then separated into fifths. the most recent fifth receiving a 1 and the least recent earning a 5 |
|
|
Term
|
Definition
customers are arranged by frequency, and split into fifths. most frequent fifth receives a 1, least frequent a 5 |
|
|
Term
|
Definition
customers are arranged by amount, and split into fifths. most expensive fifth receives a 1, least expensive a 5 |
|
|
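The three scoring cards above (recency, frequency, and amount, each split into fifths and scored 1 to 5) can be sketched in Python as follows. The ten customers and their figures are invented, and `quintile_scores` is a hypothetical helper; with customer counts that do not divide evenly by five, the fifths will be slightly uneven.

```python
def quintile_scores(values, best_is_high=True):
    """Map each customer to a 1-5 score: 1 for the best fifth, 5 for the worst."""
    ranked = sorted(values, key=values.get, reverse=best_is_high)
    n = len(ranked)
    return {cust: (rank * 5) // n + 1 for rank, cust in enumerate(ranked)}


# Hypothetical customers: (days since last order, orders per year, avg spend)
customers = {
    "c01": (2, 30, 900), "c02": (5, 3, 50), "c03": (10, 25, 100),
    "c04": (20, 12, 700), "c05": (30, 18, 300), "c06": (45, 1, 800),
    "c07": (60, 8, 200), "c08": (90, 15, 400), "c09": (120, 5, 600),
    "c10": (200, 2, 20),
}

# For recency, fewer days since the last order is better, so low values rank first.
recency = quintile_scores({c: v[0] for c, v in customers.items()}, best_is_high=False)
frequency = quintile_scores({c: v[1] for c, v in customers.items()})
monetary = quintile_scores({c: v[2] for c, v in customers.items()})

rfm = {c: (recency[c], frequency[c], monetary[c]) for c in customers}
print(rfm["c01"], rfm["c06"], rfm["c10"])
```

A customer scoring (1, 1, 1) ordered recently, orders often, and spends a lot; one scoring (3, 5, 1) spends heavily per order but rarely buys.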
Term
|
Definition
data mining technique for determining sales patterns. shows products customers buy together. based on association rules of probability, support, confidence, and lift |
|
|
Term
|
Definition
associations and/or correlations among a large set of data items. provided in "if then" statements, and the rules are probabilistic |
|
|
Term
|
Definition
likelihood that two items will be purchased together |
|
|
Term
|
Definition
frequency product appears in a transaction database |
|
|
Term
|
Definition
likelihood that a person buying product A will also buy product B |
|
|
Term
|
Definition
how much more likely a person who buys product A is to also buy product B, compared with the likelihood that anyone who walks into the store will buy product B |
|
|
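The measures on the preceding cards can be worked through on a tiny example. The sketch below uses the common textbook definitions (support as the share of transactions containing the items, confidence as support of A and B together divided by support of A, lift as confidence divided by support of B); these labels are an assumption and may not match the exact wording of the course material. The five transactions are invented.

```python
# Hypothetical transaction database: each set is one shopping basket.
TRANSACTIONS = [
    {"diapers", "beer"},
    {"diapers", "beer", "chips"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk"},
]


def support(transactions, *items):
    """Share of transactions that contain all of the given items."""
    wanted = set(items)
    return sum(wanted <= basket for basket in transactions) / len(transactions)


def confidence(transactions, a, b):
    """Likelihood that a basket containing a also contains b."""
    return support(transactions, a, b) / support(transactions, a)


def lift(transactions, a, b):
    """How much more likely b is among buyers of a than among all shoppers."""
    return confidence(transactions, a, b) / support(transactions, b)


both = support(TRANSACTIONS, "diapers", "beer")
conf = confidence(TRANSACTIONS, "diapers", "beer")
lft = lift(TRANSACTIONS, "diapers", "beer")
print(both, conf, lft)
```

A lift above 1 suggests that buying A makes buying B more likely than it is for shoppers in general; a lift near 1 suggests the two purchases are unrelated.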
Term
why do companies risk lawsuits over privacy infringement? |
|
Definition
the cost of a lawsuit is far outweighed by the profits gained from leveraging this private information |
|
|
Term
what's the solution to the data dilemma? |
|
Definition
the solution to the data dilemma is data warehouses |
|
|
Term
|
Definition
provide information for improving decision making. include: reporting, data-mining, knowledge management, expert systems |
|
|
Term
|
Definition
look at data for patterns with human eye, build association rules, analyze associations by looking at things like confidence and lift |
|
|