InfoSphere Streams: New Era of Computing

The volume of data available to organizations is expected to escalate tremendously, and the opportunity for smart stream processing solutions to create new value will grow in proportion.

  • Volume: 12 terabytes of Tweets created daily
  • Variety: hundreds of video feeds from surveillance cameras
  • Velocity: 5 million trade events per second

The faster all of this data can be analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations.

To automate and incorporate streaming data into the decision-making process, you can use a new programming paradigm called stream processing, which helps harness the potential of data in motion. In traditional computing, you access relatively static information to answer evolving and dynamic analytic questions. With stream processing, you deploy an application that continuously applies analysis to an ever-changing stream of data before it ever lands on disk.
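The contrast between querying stored data and analyzing data in motion can be illustrated with a minimal sketch. It is written in plain Python rather than against the InfoSphere Streams API, and the names `rolling_average` and `trade_feed`, the sample prices, and the window size are all hypothetical, chosen only for illustration. Each event is analyzed as it arrives, and nothing is written to disk first.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, float]  # (symbol, price); an assumed event shape


def rolling_average(events: Iterable[Event], window: int = 3) -> Iterator[Event]:
    """Yield (symbol, mean of the last `window` prices) as each event arrives."""
    recent: Dict[str, deque] = {}  # symbol -> most recent prices only
    for symbol, price in events:
        prices = recent.setdefault(symbol, deque(maxlen=window))
        prices.append(price)  # the oldest price drops out automatically
        yield symbol, sum(prices) / len(prices)


def trade_feed() -> Iterator[Event]:
    """Hypothetical stand-in for a live feed; a real source would not end."""
    yield from [
        ("IBM", 100.0),
        ("IBM", 102.0),
        ("ACME", 50.0),
        ("IBM", 104.0),
        ("IBM", 106.0),
    ]


# Results are produced one event at a time; list() is used here only
# because the sample feed is finite.
results = list(rolling_average(trade_feed(), window=3))
for symbol, average in results:
    print(symbol, average)
```

Because only the last few prices per symbol are kept in memory, the same loop could in principle run indefinitely over an unbounded feed, which is the property that distinguishes it from a batch job over stored data.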

Stream Computing enables organizations to:

  • Enhance existing models with new insights
  • Capture, analyze and act on insight before opportunities are lost forever
  • Analyze and act upon rapidly changing data in real time
  • Move from batch processing to real-time analytics and decisions

Some of the use cases for Stream Computing are:

  • Real-time sentiment analysis of social media – Effectively respond to improve the client experience
  • Internet of Things – Optimize availability, performance, capacity and resource utilization
  • Next best action – Act on up-to-the-second observations, while the event/transaction is still happening
  • Enhanced security intelligence – Predict, prevent and act on security threats, detect fraud in real time and increase situational awareness

IBM Skill Sets

  • Visualization & Discovery:
    • BigSheets, Dashboard & Visualization
  • Application & Development:
    • Text Analytics, Pig, JAQL, Hive
  • Analytics Engines:
    • R, AQL, HIL
  • Workload Optimization:
    • ZooKeeper, Oozie, Pig, Hive, Big SQL, HCatalog
  • Integration:
    • Flume, Sqoop
  • Runtime, Data Store and File System:
    • MapReduce, HBase, HDFS