New IBM Watson Data Platform and Data Science Experience

Yesterday IBM announced a new IBM Watson Data Platform that combines the world’s fastest data ingestion engine, touted at speeds of 100+ GB/second, with cloud data sources, data science, and cognitive API services. IBM is also making IBM Watson Machine Learning Service more intuitive with a self-service interface.

Make data simple

According to Bob Picciano, Senior Vice President of IBM Analytics, “Watson Data Platform applies cognitive assistance for creating machine learning models, making it far faster to get from data to insight. It also provides one place to access machine learning services and languages, so that anyone, from an app developer to the Chief Data Officer, can collaborate seamlessly to make sense of data, ask better questions, and more effectively operationalize insight.”

Back when IBM Watson beat Ken Jennings at Jeopardy, its Power processor ran 100 calculations a million times per second. Now, Picciano said, it runs 1 million calculations a million times per second.

Unified User Experience

IBM Watson Data Platform nicely integrates IBM Cloud Data Sources, Data Science Experience, Watson APIs, and much more on the IBM cloud, Bluemix.

IBM Watson Data Platform

Persona-based capabilities within IBM Watson Data Platform enable a high level of collaboration across data scientists, data engineers, business analysts, and developers. The platform provides one collaborative environment for multiple roles. Within that environment, experiences are designed around task-specific elements to further streamline connections and data discovery.

  • Business pros can ask questions and get valuable insights
  • Devs enjoy Watson plus third-party APIs to accelerate innovation
  • Data pros can easily manage, integrate and protect data
  • Data scientists and analytics pros become more productive with seamlessly integrated Spark, Jupyter notebooks, RStudio, Shiny and more for big data analytics via the IBM Data Science Experience

Another differentiating factor for IBM’s offerings is the advancement of intelligent Industry Models. When this all gels together…it will be even more amazing.

IBM Data Science Experience

IBM’s Data Science Experience is a cloud-hosted, browser-based environment for creating and deploying machine learning models in a guided experience. Machine learning models authored in Jupyter notebooks can be imported from a growing community library, taken from pre-built industry solutions, or created from scratch.

IBM’s investments in Apache Spark, Bluemix cloud, cognitive computing, and several innovations developed by IBM Research were evident while reviewing these solutions. Like other massive technology vendors embracing open source, IBM is leveraging an expanding, familiar analytics ecosystem of Spark SQL, Python, R, Java, and Scala.

Highlighted partners include:

  • Qubole – enables users of the IBM Data Science Experience to process data using Spark on their choice of public cloud infrastructure
  • RStudio – enables the development of R packages and integrates existing tools for R, including Shiny and the new R interface for Apache Spark, sparklyr
  • Keen IO – provides a set of powerful APIs that allow data scientists to collect, analyze, and visualize events from anything connected to the internet

In the broader IBM cloud data platform architecture, I noted a fabulous project for global metadata management and collaboration using Apache Atlas that is worthy of a dedicated future article.

Quick Hands On Testing

Within minutes of logging into Bluemix for the very first time, I was able to load data, run R scripts on Spark using Jupyter and the lovely, seamlessly integrated RStudio with Shiny, visualize my data, create a database on dashDB, and simply play! Loyal readers know that I can’t see something like this and not dig in – thus @idigdata.

My immediate take-away was that the Data Science Experience user experience (UX) was intuitive and just flowed naturally. Talking to the engineer in the expo hall confirmed that UX is a top priority. Kudos on a job well done. I have been learning Spark on other offerings and tripping around a bit. This has been the easiest solution yet.

Even though it is in beta right now, I was immediately functional with advanced big data analytics technologies. I was able to spin up IBM cloud dashDB and IoT ingestion services, and had no trouble finding where I’d hook up to Watson Cognitive APIs for feeding data into an analytics solution. Embedding the Shiny visualizations is a roadmap item. I do need to explore a bit further how IBM Watson Analytics fits in. Note that I do plan to write another dedicated article on IBM Watson Analytics soon.

All in all, I love the direction. I’ll be staying on top of IBM analytics more closely now. Here are a few screenshots from my first few minutes of playing. When I get a chance, I will write a longer solution review. Since it is tech industry conference crunch time, I am overstretched right now.

IBM Watson Data Platform

IBM Watson DSX

IBM Watson APIs

Future Data Science Automation Assistance

Built on Apache Spark, IBM Watson Machine Learning will in the future build models intelligently and automatically, using patented Cognitive Assistance to score and recommend the best model from a comprehensive set of industry-optimized algorithms. (Notice an industry theme…that domain expertise is key.)
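
To make the "score and recommend the best model" idea concrete, here is a minimal sketch in plain Python. It is not IBM's Cognitive Assistance and does not use any Watson or Spark API. It only illustrates the general pattern of scoring several candidate models on held-out data and recommending the winner. The two candidate models, the function names, and the toy data are all assumptions made up for illustration.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate: ordinary least squares on a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def cv_mse(fit, xs, ys, folds=4):
    """Mean squared error averaged over `folds` held-out slices."""
    rows = list(zip(xs, ys))
    errors = []
    for k in range(folds):
        train = [r for i, r in enumerate(rows) if i % folds != k]
        test = [r for i, r in enumerate(rows) if i % folds == k]
        model = fit([x for x, _ in train], [y for _, y in train])
        errors.append(mean((model(x) - y) ** 2 for x, y in test))
    return mean(errors)

def recommend(candidates, xs, ys):
    """Score every candidate; return the lowest-error name and all scores."""
    scores = {name: cv_mse(fit, xs, ys) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

# Toy data with a roughly linear trend (hypothetical, for illustration only).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

best, scores = recommend({"mean": fit_mean, "linear": fit_linear}, xs, ys)
print("recommended:", best)
print("scores:", scores)
```

A real service would presumably search a much larger algorithm library and tune each candidate, but the score-then-recommend loop would look broadly similar.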

IBM Project Farcast

Project Farcast, shown by IBM Research in the expo hall, impressed me. It was the first time I had seen predictive data prep exposed to the user. BeyondCore and other automated predictive offerings run automated predictive data prep in a black-box manner today.

IBM Research is tackling the art of preparing data sets for predictive models, which is where most of the predictive modeling time and effort goes.

Project Farcast was able to show me data prep options with anticipated predictive model lift. After selecting an option, those data prep steps would be automatically performed along with generation of an enhanced predictive model. WOW! I can’t wait to see how this capability gets cooked into IBM Watson Data Platform, Data Science Experience, Watson Analytics and other solutions.
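
I did not see how Project Farcast works under the hood, so the following is only my own toy sketch of the idea in plain Python: try a few data prep options, estimate the lift each one gives over an untouched baseline, then apply the chosen option and fit the final model. The prep options, the lift definition (relative reduction in held-out error), the function names, and the data are all hypothetical.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line on one feature; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def holdout_mse(prep, xs, ys):
    """Fit on even-indexed rows, score on odd-indexed rows."""
    txs = [prep(x) for x in xs]
    model = fit_line(txs[0::2], ys[0::2])
    return mean((model(x) - y) ** 2 for x, y in zip(txs[1::2], ys[1::2]))

# Hypothetical data prep options: feature transforms to try.
PREP_OPTIONS = {"log": math.log, "sqrt": math.sqrt}

def rank_prep_options(xs, ys):
    """Return (option, lift) pairs, best first; lift is the relative
    reduction in held-out error versus using the raw feature."""
    baseline = holdout_mse(lambda x: x, xs, ys)
    lifts = {
        name: 1 - holdout_mse(prep, xs, ys) / baseline
        for name, prep in PREP_OPTIONS.items()
    }
    return sorted(lifts.items(), key=lambda kv: kv[1], reverse=True)

def apply_option(name, xs, ys):
    """Apply the selected prep step and fit the final model on all rows."""
    prep = PREP_OPTIONS[name]
    model = fit_line([prep(x) for x in xs], ys)
    return lambda x: model(prep(x))

# Toy data where the target depends on log(x) (made up for illustration).
xs = [1, 2, 4, 8, 16, 32, 64, 128]
ys = [3 * math.log(x) + 1 for x in xs]

ranked = rank_prep_options(xs, ys)
for name, lift in ranked:
    print(f"{name}: estimated lift {lift:.2%}")

final_model = apply_option(ranked[0][0], xs, ys)
```

The point of the sketch is the workflow rather than the math: the user sees each option with its expected payoff before anything is applied, instead of the prep happening inside a black box.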

Additional Resources

For more information or a free trial of IBM Watson Data Platform, Data Science Experience, Watson APIs, or Bluemix, check out the following resources.

I promise to share more news from this event. There was soooooo much innovation. I also saw an incredible immersive virtual reality data viz and scored a Raspberry Pi 3 to ramp up on IoT analytics. More to come…

Virtual Reality Data Viz