Deep data analysis has become indispensable to businesses. To remain competitive, companies of all sizes rely on analytics tools to glean insights from disparate data, monitor their KPIs and provide reports to support sound decision-making. Underpinning all these efforts are data warehouses, specialized computer systems designed to store data efficiently and securely and to quickly deliver simultaneous query results to data analysts and business decision-makers.
What Is a Data Warehouse?
A data warehouse is a computer system designed to store and analyze large amounts of structured or semi-structured data. It serves as a central repository, accessible to authorized business users who rely on analysis to make better-informed decisions. A data warehouse is a key component of most business intelligence (BI) strategies. Data is routinely transformed and loaded into a data warehouse from various transactional systems, relational databases and other sources. Data engineers and scientists, business analysts and decision-makers access the data using BI tools, as well as other analytics applications like machine learning, and use it to populate dashboards and generate reports.
Data Warehouses Defined
Data warehouses are computer systems that are used to store, perform queries on and analyze large amounts of historical data, which often comes from multiple sources. Over time, they build a historical record that can be invaluable to data scientists and business analysts. And because data entering a data warehouse goes through a series of cleaning and prepping processes, the data stored is of high quality. Thus, the data warehouse’s records are often considered an organization’s definitive source of accurate data. Data warehouses usually include:
How Does a Data Warehouse Work?
A data warehouse transforms relational data and data from other sources into multidimensional schemas for the sole purpose of analysis. During this transformation, metadata is created to speed up queries and searches. A semantic layer rests on top of this data layer to organize and map complex data into familiar business language like ‘product’ or ‘customer’ so analysts can quickly build analyses without needing to know database table names. Finally, an analytics layer rests on top of the semantic layer to give authorized users access to the data and let them visualize and interpret it.
What Are Data Warehouses Used For?
A data warehouse is used to analyze many different types of business data in a non-production environment. Running analytics there, rather than on the operational databases, allows those databases to continue to record transactions and support the business. Companies use data warehouses to discover patterns, trends, outliers and other relationships in their data that develop over time. Other major advantages of a data warehouse are that it can analyze data from multiple sources and extract data from different types of storage systems. It also safeguards the integrity of a company’s data by allowing businesspeople to query it without accidentally altering or disturbing a production environment in any way.
When to Use a Data Warehouse
While there are myriad good reasons to use a data warehouse, these four stand out:
Data Warehouses versus Data Lakes
A data warehouse can analyze vast amounts of relational data from many different sources, including transactional systems, operational databases and line-of-business applications. This can amount to hundreds of gigabytes and even petabytes (quadrillions of bytes) of data. Since the data is highly curated, it can serve as the company’s gold standard or definitive version of information. Common applications include BI analytics and graphic visualizations. A data lake, on the other hand, can be used to analyze many different types of data, including both structured (such as the data found in a relational database) and unstructured (such as the bits and bytes that comprise a video, a text message or a social media posting). This may also include raw data that has not been scrubbed, deduped or curated. Common data lake applications include machine learning, data discovery, big data analysis and profiling.
Data Warehouses versus Databases
Databases are geared to create a record of transactions as they occur. They capture data “as is” from a single source, such as a credit card processing system. They do this continuously, in real time, as the transactions are processed. Data warehouses, in comparison, are designed to perform analytics on vast amounts of data from many different sources. Rather than registering individual data entries at top speed, data warehouses are optimized to rapidly query large volumes of that data after it has been recorded.
Data Warehouses versus Data Marts
A data mart is a subset of a data warehouse that is dedicated to the needs of a specific function or business unit, like finance, marketing or sales. A data mart is smaller and more specialized than a full-fledged data warehouse, and it aggregates data from fewer sources. It can be set up as a separate, discrete system or as part of a larger data warehouse.
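The contrast between these systems can be sketched in a few lines of Python using the standard-library sqlite3 module. This is a minimal illustration, not how a production data warehouse is built: the sales table, its columns, the sample rows and the marketing_mart view are all hypothetical, and a single in-memory SQLite database stands in for what would normally be separate systems. It shows the transactional pattern (many small writes), the warehouse pattern (one aggregate query across all rows) and a data mart as a subset serving one business unit.

```python
import sqlite3

# Hypothetical stand-in for an operational database: one row per sale,
# written as each transaction occurs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, dept TEXT, region TEXT, amount REAL)"
)

# Transactional workload: many small, individual writes.
transactions = [
    ("finance", "east", 120.0),
    ("marketing", "east", 80.0),
    ("marketing", "west", 50.0),
    ("finance", "west", 200.0),
]
conn.executemany(
    "INSERT INTO sales (dept, region, amount) VALUES (?, ?, ?)", transactions
)

# Warehouse-style workload: a read-only query that aggregates over all rows.
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

# Data-mart-style subset: a view restricted to a single business unit.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT region, amount FROM sales WHERE dept = 'marketing'"
)
marketing_total = conn.execute(
    "SELECT SUM(amount) FROM marketing_mart"
).fetchone()[0]

print(totals_by_region)
print(marketing_total)
```

With the sample rows above, the regional totals are 200.0 for east and 250.0 for west, and the marketing subset sums to 130.0. In practice the analytical queries would run against a separate warehouse so that they do not compete with transaction processing.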
How Do Data Warehouses, Databases and Data Lakes Work Together?
Many businesses use a combination of databases, data lakes and data warehouses to store and analyze their data. The data may be recorded in their operational databases and then fed to their data warehouses for further analysis. But not all of their data comes from a structured database that stores data in a tabular format. Some applications, like big data analytics, full text search and machine learning, can make use of unstructured data, such as phone calls and handwritten notes. This kind of data is captured and fed into a company’s data lake, where it can be prepped for further analysis in the data warehouse.
Functions of a Data Warehouse
A data warehouse is specially designed to perform data analytics. This usually entails sorting through large amounts of data from different sources in order to ferret out different trends and relationships captured by the data. It has two core functions:
Taken together, these basic functions allow a wide range of analytics tools to integrate various kinds of data from many different sources and then examine it to answer questions, spot business trends and predict future performance.
Types of Data Warehouses
Originally, all data warehouses were on-premises, but, like other information technology, they are rapidly migrating into the cloud. Here is a look at the options and what each one has to offer.
On-premises data warehouse. With an on-premises approach, all the hardware and software required is purchased, licensed, deployed and maintained by the business that uses it. This approach is still in use and offers organizations several advantages:
Data warehouse appliance. One type of on-premises data warehouse is a data warehouse appliance. These self-contained hardware devices enable companies to more easily scale their data warehouse infrastructure to support their business analytics requirements as they grow and expand. However, these appliances, as well as on-premises systems in general, are being replaced as companies of all sizes migrate to the newest type of data warehouse.
Cloud data warehouses. Like all cloud-based applications, cloud data warehouses don’t require an organization to purchase or maintain any hardware or software. A business simply pays for the subscription, storage space and computing power it needs at a given time. Expanding the capacity of a cloud data warehouse is a simple matter of adding more cloud resources; there’s no need to employ people to administer or maintain the underlying technology infrastructure since these tasks are handled by the cloud service provider. Taking a cloud-based approach to data warehousing offers a company numerous benefits. These include:
Data Warehouse Architecture
The design or architecture of a data warehouse typically consists of three tiers:
While those three tiers remain consistent, the architecture of any individual data warehouse usually includes modifications specific to a company’s needs. Starting with the basics, all data warehouses include a central database to store metadata, summary data and raw data. That’s the repository that takes in data and is accessed by business decision makers for analysis. Additional approaches build on this simple architecture, including:
Data Warehouse Schema
All data warehouses are based on a schema, which is a type of blueprint or logical description of how the data is organized. It includes the name and description of the different kinds of records that the warehouse holds. There are three basic models:
Star schema. In a star schema, data tables are one-dimensional; that is, each table contains data describing a single attribute, such as time, location or units sold.
Snowflake schema. A snowflake schema is more complex but also takes up less storage space and is easier to maintain. Its data tables are multidimensional; instead of describing a single attribute, they are subdivided into additional tables that provide related attributes. So, for example, a table on sales might include a location attribute, which is linked to another table that provides further details, such as city and street. The location table’s city entry may also be linked to yet another table, which holds data about the state or province and country where the city is located.
Galaxy schema. A galaxy schema, also known as a constellation schema, is something of a cross between the star and snowflake schemas in that it can contain data tables that are both one- and multidimensional.
Benefits of a Data Warehouse
The primary or overarching benefit of a data warehouse is that it allows a company to analyze large amounts of assorted types of data and maintain a historical record of it. More specifically, the benefits of a data warehouse include the ability to:
Disadvantages of a Data Warehouse
As many benefits as they offer, data warehouses also have some drawbacks. Some of the chief concerns are:
Data Warehouse Examples
Here’s how data warehousing is often used to support business operations in three different industrial sectors:
History of Data Warehousing
As computer systems became more ubiquitous and complex, and the amount of data that they processed began to multiply, requirements to store, access and analyze that data became much more demanding. The earliest efforts to warehouse data more efficiently began in response to this and date back to the time when mainframes ruled the data-processing world and microprocessor-based personal computers had yet to be invented. Here are some of the key milestones in the evolution of the data warehouse:
The contemporary concept of a data warehouse emerged in the late 1980s, when IBMers Paul Murphy and Barry Devlin developed the Business Data Warehouse. However, it is William Inmon who is credited with being the father of the data warehouse for first elaborating on the concept and linking it to the notion of a “Corporate Information Factory.”
Future of Data Warehouses
The future of the data warehouse is in the cloud. Successful outcomes with big data and data analytics are whetting the corporate world’s appetite for more data. By locating it within cloud computing services, a business can cost-effectively scale its data warehouse capacity to keep pace with its ever-growing analytics requirements. Moreover, a company with its data warehouse in the cloud will no longer have to concern itself with keeping its analytics software up-to-date, a major concern with on-premises data warehouses. That concern will completely evaporate once responsibility is handed off to a service provider. For these and other reasons, including enhanced security and lower start-up costs, cloud-based data warehouse implementations will become de rigueur.
A New Data Warehouse Is Born for Today’s Data-Centric Businesses
Businesses today cannot remain competitive without leveraging their data. Companies of all types and sizes rely on data-driven insights to stay current with their offerings and relevant to their customers. To take full advantage of their data and extract all the insights they can, companies need a lower cost, simpler-to-deploy, easier-to-use, cloud-based data warehouse. Enter the NetSuite Analytics Warehouse: a new cloud-based data warehouse based on Oracle Autonomous Data Warehouse and Oracle Analytics Cloud technology but optimized for use with NetSuite’s business applications served from the cloud.
The NetSuite Analytics Warehouse comes prebuilt to automatically transform NetSuite application data into formats for the data warehouse, where it can be combined, analyzed and visualized together with data from multiple external sources to yield more powerful business insights. It will perform queries quickly and provide increased flexibility for data analysts and business decision makers to slice and dice their data to meet a variety of needs. As commerce increasingly shifts to the digital realm, businesses must equip everyone from product engineers to sales managers with data insights that help them perform their jobs more effectively and engage in the kind of data analysis that leads to innovative work that pushes a business forward. Otherwise, they will simply fall behind those organizations that do. Thus, well-designed data warehouses that provide the foundation for business intelligence have become a necessity for organizations of all sizes.
Data Warehouse FAQs
What is a data warehouse used for?
A data warehouse can be used to analyze many different types of business data without the limitations of a conventional database. Unlike most relational databases, it can analyze data from multiple sources and extract data from different types of storage systems. It also safeguards the integrity of a company’s data by allowing users to query it without accidentally altering or disturbing it in any way.
What is an example of a data warehouse?
In the retail industry, data warehouses are used for forecasting and to provide business intelligence. Uses include tracking product performance, determining optimal pricing, evaluating promotional strategies and analyzing customer buying patterns.
What is the data warehousing process?
A data warehouse centralizes and consolidates large amounts of data from multiple sources. Over time, it builds a historical record that can be invaluable to data scientists and business analysts.
The data stored is of the highest quality and the data warehouse’s records are often considered definitive, serving as an organization’s “single source of truth.” Many businesses use a combination of databases, data lakes and data warehouses to store and analyze their data. The data may be recorded in their operational databases and then fed to their data warehouses for further analysis.
Why should the data warehouse be separate from the operational database?
A major reason for such a separation is to help promote the high performance of both systems. An operational database is designed and tuned for known functions and workloads, including indexing and hashing using primary keys, searching for specific records and optimizing “canned” queries. A data warehouse, by contrast, is optimized to query large volumes of data, and running that analytical workload on the operational database would slow transaction processing.
What are the three main types of data warehouses?
The three main types of data warehouses are the enterprise data warehouse (EDW), the operational data store (ODS) and the data mart.
Why are data warehouses necessary?
A data warehouse serves as a repository to store historical data that can be used for analysis. Online analytical processing (OLAP) can be used to analyze and evaluate data in a warehouse. The warehouse has data coming from varied sources.
Why were data warehouses created?
A data warehouse is designed to allow its users to run queries and analyses on historical data derived from transactional sources. Data added to the warehouse does not change and cannot be altered. The warehouse is the source that is used to run analytics on past events, with a focus on changes over time.
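As a rough sketch of what analysis with a focus on changes over time can look like, the Python example below stores a few monthly figures in an append-only table and computes the month-over-month change. The revenue_history table, its columns and the sample values are hypothetical, and an in-memory SQLite database (via the standard-library sqlite3 module) stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical history table: rows are only ever appended, never updated.
conn.execute(
    "CREATE TABLE revenue_history (month TEXT, product TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO revenue_history VALUES (?, ?, ?)",
    [
        ("2023-01", "widget", 100.0),
        ("2023-02", "widget", 150.0),
        ("2023-03", "widget", 120.0),
    ],
)

# Read the stored history for one product in chronological order.
rows = conn.execute(
    "SELECT month, revenue FROM revenue_history "
    "WHERE product = 'widget' ORDER BY month"
).fetchall()

# Month-over-month change: pair each month with the one before it.
changes = [
    (cur_month, cur_rev - prev_rev)
    for (_, prev_rev), (cur_month, cur_rev) in zip(rows, rows[1:])
]

print(changes)
```

With these sample values, revenue rises by 50.0 in 2023-02 and falls by 30.0 in 2023-03. Because the history is never overwritten, the same query returns the same answer for past periods no matter when it is run.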