PureData System: High Performance, High Availability, High Scalability

With the volume, velocity and variety of data growing, and with businesses needing fast answers, using a multi-purpose system for all data workloads is often not the most cost-effective or lowest-risk approach, and it is definitely not the fastest to deploy. The new PureData System is optimized exclusively for delivering data services to today’s demanding applications.

The IBM PureData System for Analytics is a purpose-built, standards-based data warehouse appliance that integrates server, storage, database and advanced analytic capabilities into a single, easy-to-manage system. Powered by Netezza technology, it optimizes the performance of your data warehouse. It is ready for big data and business intelligence, so you can run complex algorithms in minutes, not hours. Designed for rapid and deep analysis of data volumes scaling into the petabytes, it delivers insight never before thought possible, at a low cost of ownership.

IBM PureData System offers:

  • Speed: 10-100x faster than traditional systems
  • Simplicity: Minimal administration and tuning
  • Scalability: Peta-scale user data capacity
  • Smart: High-performance advanced analytics

Some of the benefits of IBM PureData:

  • There’s nothing to do. It’s an appliance

BI Developers & DBAs – faster delivery

  • No configuration
  • No physical modelling
  • No indexes
  • No tuning – out of the box performance
  • Data model agnostic

ETL Developers

  • No aggregate tables needed – simpler ETL logic
  • Faster load and transformation times

Business Analysts

  • Train of thought analysis – 10 to 100x faster
  • True ad hoc queries – no tuning, no indexes
  • Ask complex queries against large datasets
  • Lower latency – load & query simultaneously
  • On-stream processing by hundreds of nodes

IBM Skill Sets

  • Visualization & Discovery:
    • BigSheets, Dashboard & Visualization
  • Application & Development:
    • Text Analytics, Pig, JAQL, Hive
  • Analytics Engines:
    • R, AQL, HIL
  • Workload Optimization:
    • ZooKeeper, Oozie, Pig, Hive, Big SQL, HCatalog
  • Integration:
    • Flume, Sqoop
  • Runtime, Data Store and File System:
    • MapReduce, HBase, HDFS