Welcome!

IBM Cloud Authors: Yeshim Deniz, Flint Brenton, Scott Allen, Liz McMillan, Elizabeth White

Blog Feed Post

Deja VVVu: Others Claiming Gartner’s Construct for Big Data

By

This article originally appeared on the Gartner Blog Network in January 2012 and is reprinted here with permission from Gartner and its author Doug Laney

In the late 1990s, while a META Group analyst (Note: META is now part of Gartner), it was becoming evident that our clients increasingly were encumbered by their data assets.  While many pundits were talking about, many clients were lamenting, and many vendors were seizing the opportunity of these fast-growing data stores, I also realized that something else was going on. Sea changes in the speed at which data was flowing mainly due to electronic commerce, along with the increasing breadth of data sources, structures and formats due to the post Y2K-ERP application boom were as or more challenging to data management teams than was the increasing quantity of data.

In an attempt to help our clients get a handle on how to recognize, and more importantly, deal with these challenges I began first speaking at industry conferences on this 3-dimensional data challenge of increasing data volume, velocity and variety.  Then in late 2000 I drafted a research note published in February 2001 entitled 3-D Data Management: Controlling Data Volume, Velocity and Variety.

Fast forward to today:  The “3V’s” framework for understanding and dealing with Big Data has now become ubiquitous.  In fact, other research firms, major vendors and consulting firms have even posited the 3Vs (or an unmistakable variant) as their own concept.  Since the original piece is no longer available in Gartner archives but is in increasing demand, I wanted to make it available here for anyone to reference and cite:

Original Research Note PDF: 3-D Data Management: Controlling Data Volume, Velocity and Variety

Date: 6 February 2001     Author: Doug Laney

3-D Data Management: Controlling Data Volume, Velocity and Variety. Current business conditions and mediums are pushing traditional data management principles to their limits, giving rise to novel and more formalized approaches.

META Trend: During 2001/02, leading enterprises will increasingly use a centralized data warehouse to define a common business vocabulary that improves internal and external collaboration. Through 2003/04, data quality and integration woes will be tempered by data profiling technologies (for generating metadata, consolidated schemas, and integration logic) and information logistics agents. By 2005/06, data, document, and knowledge management will coalesce, driven by schema-agnostic indexing strategies and portal maturity.

The effect of the e-commerce surge, a rise in merger & acquisition activity, increased collaboration, and the drive for harnessing information as a competitive catalyst is driving enterprises to higher levels of consciousness about how data is managed at its most basic level.  In 2001-02, historical, integrated databases (e.g. data warehouses, operational data stores, data marts), will be leveraged not only for intended analytical purposes, but increasingly for intra-enterprise consistency and coordination. By 2003-04, these structures (including their associated metadata) will be on par with application portfolios, organization charts and procedure manuals for defining a business to its employees and affiliates.

Data records, data structures, and definitions commonly accepted throughout an enterprise reduce fiefdoms pulling against each other due to differences in the way each perceives where the enterprise has been, is presently, and is headed.  Readily accessible current and historical records of transactions, affiliates (partners, employees, customers, suppliers), business processes (or rules), along with definitional and navigational metadata (see ADS Delta 896, 21st Century Metadata: Mapping the Enterprise Genome, 7 Aug 2000) enable employees to paddle in the same direction.  Conversely, application-specific data stores (e.g. accounts receivable versus order status), geographic-specific data stores (e.g. North American sales vs. International sales), offer conflicting, or insular views of the enterprise, that while important for feeding transactional systems, provide no “single version of the truth,” giving rise to inconsistency in the way enterprise factions function.

While enterprises struggle to consolidate systems and collapse redundant databases to enable greater operational, analytical, and collaborative consistencies, changing economic conditions have made this job more difficult.  E-commerce, in particular, has exploded data management challenges along three dimensions: volumes, velocity and variety.  In 2001/02, IT organizations must compile a variety of approaches to have at their disposal for dealing with each.

Data Volume

E-commerce channels increase the depth and breadth of data available about a transaction (or any point of interaction). The lower cost of e-channels enables and enterprise to offer its goods or services to more individuals or trading partners, and up to 10x the quantity of data about an individual transaction may be collected—thereby increasing the overall volume of data to be managed.  Furthermore, as enterprises come to see information as a tangible asset, they become reluctant to discard it.

Typically, increases in data volume are handled by purchasing additional online storage.  However as data volume increases, the relative value of each data point decreases proportionately—resulting in a poor financial justification for merely incrementing online storage. Viable alternates and supplements to hanging new disk include:

  • Implementing tiered storage systems (see SIS Delta 860, 19 Apr 2000) that cost effectively balance levels of data utility with data availability using a variety of media.
  • Limiting data collected to that which will be leveraged by current or imminent business processes
  • Limiting certain analytic structures to a percentage of statistically valid sample data.
  • Profiling data sources to identify and subsequently eliminate redundancies
  • Monitoring data usage to determine “cold spots” of unused data that can be eliminated or offloaded to tape (e.g. Ambeo, BEZ Systems, Teleran)
  • Outsourcing data management altogether (e.g. EDS, IBM)

Data Velocity

E-commerce has also increased point-of-interaction (POI) speed, and consequently the pace data used to support interactions and generated by interactions. As POI performance is increasingly perceived as a competitive differentiator (e.g. Web site response, inventory availability analysis, transaction execution, order tracking update, product/service delivery, etc.) so too is an organization’s ability to manage data velocity.  Recognizing that data velocity management is much more than a physical bandwidth and protocol issue, enterprises are implementing architectural solutions such as:

  • Operational data stores (ODSs) that periodically extract, integrate and re-organize production data for operational inquiry or tactical analysis
  • Caches that provide instant access to transaction data while buffering back-end systems from additional load and performance degradation. (Unlike ODSs, caches are updated according to adaptive business rules and have schemas that mimic the back-end source.)
  • Point-to-point (P2P) data routing between databases and applications (e.g. D2K, DataMirror) that circumvents high-latency hub-and-spoke models that are more appropriate for strategic analysis
  • Designing architectures that balance data latency with application data requirements and decision cycles, without assuming the entire information supply chain must be near real-time.

Data Variety

Through 2003/04, no greater barrier to effective data management will exist than the variety of incompatible data formats, non-aligned data structures, and inconsistent data semantics.  By this time, interchange and translation mechanisms will be built into most DBMSs. But until then, application portfolio sprawl (particularly when based on a “strategy” of autonomous software implementations due to e-commerce solution immaturity), increased partnerships, and M&A activity intensifies data variety challenges. Attempts to resolve data variety issues must be approached as an ongoing endeavor encompassing the following techniques:

  • Data profiling (e.g. Data Mentors, Metagenix) to discover hidden relationships and resolve inconsistencies across multiple data sources (see ADS898)
  • XML-based data format “universal translators” that import data into standard XML documents for export into another data format (e.g. infoShark, XML Solutions)
  • Enterprise application integration (EAI) predefined adapters (e.g. NEON, Tibco, Mercator) for acquiring and delivering data between known applications via message queues, or EAI development kits for building custom adapters.
  • Data access middleware (e.g. Information Builders’ EDA/SQL, SAS Access, OLE DB, ODBC) for direct connectivity between applications and databases
  • Distributed query management (DQM) software (e.g. Enth, InfoRay, Metagon) that adds a data routing and integration intelligence layer above “dumb” data access middleware
  • Metadata management solutions (i.e. repositories and schema standards) to capture and make available definitional metadata that can help provide contextual consistency to enterprise data
  • Advanced indexing techniques for relating (if not physically integrating) data of various incompatible types (e.g. multimedia, documents, structured data, business rules).

As with any sufficiently fashionable technology, users should expect the data management market place ebb-and-flow to yield solutions that consolidate multiple techniques and solutions that are increasingly application/environment specific. (See Figure 1 – Data Management Solutions) In selecting a technique or technology, enterprises should first perform an information audit assessing the status of their information supply chain to identify and prioritize particular data management issues.

Business Impact: Attention to data management, particularly in a climate of e-commerce and greater need for collaboration, can enable enterprises to achieve greater returns on their information assets.

Bottom Line: In 2001/02, IT organizations must look beyond traditional direct brute force physical approaches to data management.  Through 2003/04, practices for resolving e-commerce accelerated data volume, velocity and variety issues will become more formalized and diverse.  Increasingly, these techniques involve trade-offs and architectural solutions that involve and impact application portfolios and business strategy decisions.

###

Over the past decade, Gartner analysts including Regina Casonato, Anne Lapkin, Mark A. Beyer, Yvonne Genovese and Ted Friedman have continued to expand our research on this topic, identifying and refining other “big data” concepts. In September 2011 they published the tremendous research note Information Management in the 21st Century.  And in 2012, Mark Beyer and I developed and published Gartner’s updated definition of Big Data to reflect its value proposition and requirements for “new innovative forms of processing.” (See The Importance of ‘Big Data’: A Definition)

Doug Laney is a research vice president for Gartner Research, where he covers business analytics solutions and projects, information management, and data-governance-related issues. He is considered a pioneer in the field of data warehousing and created the first commercial project methodology for business intelligence/data warehouse projects. Mr. Laney is also originated the discipline of information economics (infonomics). 

Follow Doug on Twitter: @Doug_Laney

Read the original blog entry...

More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publsher of CTOvision.com

@ThingsExpo Stories
SYS-CON Events announced today that EastBanc Technologies will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. EastBanc Technologies has been working at the frontier of technology since 1999. Today, the firm provides full-lifecycle software development delivering flexible technology solutions that seamlessly integrate with existing systems – whether on premise or cloud. EastBanc Technologies partners with p...
SYS-CON Events announced today that MangoApps will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device. For more information, please visit https://www.mangoapps.com/.
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in Embedded and IoT solutions, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and ...
18th Cloud Expo, taking place June 7-9, 2016, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some...
SYS-CON Events announced today Object Management Group® has been named “Media Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that IBM Cloud Data Services has been named “Bronze Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. IBM Cloud Data Services offers a portfolio of integrated, best-of-breed cloud data services for developers focused on mobile computing and analytics use cases.
Cloud computing delivers on-demand resources that provide businesses with flexibility and cost-savings. The challenge in moving workloads to the cloud has been the cost and complexity of ensuring the initial and ongoing security and regulatory (PCI, HIPAA, FFIEC) compliance across private and public clouds. Manual security compliance is slow, prone to human error, and represents over 50% of the cost of managing cloud applications. Determining how to automate cloud security compliance is critical...
The Internet of Things (IoT) is growing rapidly by extending current technologies, products and networks. By 2020, Cisco estimates there will be 50 billion connected devices. Gartner has forecast revenues of over $300 billion, just to IoT suppliers. Now is the time to figure out how you’ll make money – not just create innovative products. With hundreds of new products and companies jumping into the IoT fray every month, there’s no shortage of innovation. Despite this, McKinsey/VisionMobile data...
"What we see what happens when you have a completely networked society and the potential to now drive the value creation and the collaboration and the ecosystems that are possible when you start to be able to connect people and industries together in ways that have never been possible before," explained Esmeralda Swartz, VP of Marketing Enterprise & Cloud at Ericsson, in this SYS-CON.tv interview at @ThingsExpo, held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA.
WebRTC is bringing significant change to the communications landscape that will bridge the worlds of web and telephony, making the Internet the new standard for communications. Cloud9 took the road less traveled and used WebRTC to create a downloadable enterprise-grade communications platform that is changing the communication dynamic in the financial sector. In his session at @ThingsExpo, Leo Papadopoulos, CTO of Cloud9, will discuss the importance of WebRTC and how it enables companies to fo...
SYS-CON Events announced today that ContentMX, the marketing technology and services company with a singular mission to increase engagement and drive more conversations for enterprise, channel and SMB technology marketers, has been named “Sponsor & Exhibitor Lounge Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York City, New York. “CloudExpo is a great opportunity to start a conversation with new prospects, but what happens after the...
The IoT is changing the way enterprises conduct business. In his session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, discuss how businesses can gain an edge over competitors by empowering consumers to take control through IoT. We'll cite examples such as a Washington, D.C.-based sports club that leveraged IoT and the cloud to develop a comprehensive booking system. He'll also highlight how IoT can revitalize and restore outdated business models, making them profitable...
Customer experience has become a competitive differentiator for companies, and it’s imperative that brands seamlessly connect the customer journey across all platforms. With the continued explosion of IoT, join us for a look at how to build a winning digital foundation in the connected era – today and in the future. In his session at @ThingsExpo, Chris Nguyen, Group Product Marketing Manager at Adobe, will discuss how to successfully leverage mobile, rapidly deploy content, capture real-time d...
What a difference a year makes. Organizations aren’t just talking about IoT possibilities, it is now baked into their core business strategy. With IoT, billions of devices generating data from different companies on different networks around the globe need to interact. From efficiency to better customer insights to completely new business models, IoT will turn traditional business models upside down. In the new customer-centric age, the key to success is delivering critical services and apps wit...
As cloud and storage projections continue to rise, the number of organizations moving to the cloud is escalating and it is clear cloud storage is here to stay. However, is it secure? Data is the lifeblood for government entities, countries, cloud service providers and enterprises alike and losing or exposing that data can have disastrous results. There are new concepts for data storage on the horizon that will deliver secure solutions for storing and moving sensitive data around the world. ...
Join us at Cloud Expo | @ThingsExpo 2016 – June 7-9 at the Javits Center in New York City and November 1-3 at the Santa Clara Convention Center in Santa Clara, CA – and deliver your unique message in a way that is striking and unforgettable by taking advantage of SYS-CON's unmatched high-impact, result-driven event / media packages.
In his keynote at 18th Cloud Expo, Andrew Keys, Co-Founder of ConsenSys Enterprise, will provide an overview of the evolution of the Internet and the Database and the future of their combination – the Blockchain. Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life ...
SYS-CON Events announced today that MobiDev will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. MobiDev is a software company that develops and delivers turn-key mobile apps, websites, web services, and complex software systems for startups and enterprises. Since 2009 it has grown from a small group of passionate engineers and business managers to a full-scale mobile software company with over 200 develope...
SoftLayer operates a global cloud infrastructure platform built for Internet scale. With a global footprint of data centers and network points of presence, SoftLayer provides infrastructure as a service to leading-edge customers ranging from Web startups to global enterprises. SoftLayer's modular architecture, full-featured API, and sophisticated automation provide unparalleled performance and control. Its flexible unified platform seamlessly spans physical and virtual devices linked via a world...
SYS-CON Events announced today that BMC Software has been named "Siver Sponsor" of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2015 at the Javits Center in New York, New York. BMC is a global leader in innovative software solutions that help businesses transform into digital enterprises for the ultimate competitive advantage. BMC Digital Enterprise Management is a set of innovative IT solutions designed to make digital business fast, seamless, and optimized from mainframe to mo...