Amazon AWS Getting Serious about Analytics

Last week over 24,000 attendees gathered at the annual Amazon Web Services (AWS) re:Invent conference. Attendance has grown more than 84% since 2014, when there were 13,000 participants. If you are still questioning the inevitable shift to the cloud, wake up: plan ahead or be disrupted. It is happening.

If you have too much data to send to the cloud, Amazon has a state-of-the-art solution for you…they will send over a big truck. I kid you not!

AWS Snowmobile is an exabyte-scale data transfer service used to move extremely large amounts of data to the Amazon cloud.

You can transfer up to 100PB per Snowmobile, a 45-foot-long ruggedized shipping container pulled by a semi-trailer truck. Snowmobile is the step up from Snowball, and it makes moving massive volumes of data to the cloud feasible, including video libraries, image repositories, or even a complete data center migration.
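To see why a truck can beat the network, here is a rough back-of-the-envelope sketch. The 100PB figure comes from the description above; the link speeds and the 80% utilization factor are my own assumptions for illustration, not AWS numbers.

```python
# Rough transfer-time arithmetic. Link speeds and the 80% utilization
# figure are illustrative assumptions, not AWS numbers.
PETABYTE = 10**15  # bytes, decimal definition


def transfer_days(data_bytes: int, link_gbps: float, utilization: float = 0.8) -> float:
    """Days needed to push data_bytes over a link of link_gbps gigabits per second."""
    usable_bits_per_second = link_gbps * 1e9 * utilization
    seconds = data_bytes * 8 / usable_bits_per_second
    return seconds / 86_400


snowmobile_capacity = 100 * PETABYTE  # the 100PB per Snowmobile quoted above

for gbps in (1, 10, 100):
    days = transfer_days(snowmobile_capacity, gbps)
    print(f"{gbps:>3} Gbps link: about {days:,.0f} days ({days / 365:.1f} years)")
```

Under these assumptions, a dedicated 10 Gbps link would need roughly three years to move what fits in one container.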

Amazon AWS – Cloud Market Leader

Gartner AWS Cloud 2016

Amazon AWS, the market leader in cloud computing, touts a market share estimated by Gartner to be 10 times bigger than that of its 14 closest competitors combined.

This year Amazon claimed a $13 billion revenue run rate with 55% year-over-year growth. Impressive. One of my peers at a competing cloud vendor dismissed those numbers. Ironically, outrageous vanity metrics about cloud analytics users were shared at that competing vendor’s conference. Bottom line: vendors do tend to tell data stories. As analytics professionals, enjoy attempting to separate fact from vanity metric fiction.

AWS growth

Amazon AWS CEO Andy Jassy also highlighted how much faster AWS is delivering innovation. In 2016, approximately three new capabilities were released each day. Those are showcased in the official AWS blogs.

AWS releases

Get Ready for Cloud

When it comes to cloud-born companies, numerous rapid releases are a game-changer. Out of nowhere, a serious contender can be born and grow up within one short year. That is why you can’t ignore the cloud today. You need to be ready for it and prepare for cloud billing models.

Good read: how to avoid cloud bill shock.

Rarely will the loss-leader cloud offering include what you need to succeed. You will buy more than you estimate. Consider running risk simulation models in case steep price increases of 20% or more happen again, as they did after Brexit in 2016.
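A risk simulation does not have to be fancy. Here is a minimal Monte Carlo sketch of an annual cloud bill. Everything except the 20% shock size is an assumption I made up for illustration: the $10,000 monthly estimate, the 0-50% usage overrun range, and the 10% chance of a price hike in a given year. Swap in your own numbers.

```python
import random
import statistics


def simulate_annual_bill(
    monthly_estimate: float,
    trials: int = 10_000,
    overrun_range: tuple = (0.0, 0.5),  # assumed: usage runs 0-50% over estimate
    shock_probability: float = 0.10,    # assumed: chance of a price hike in a year
    shock_size: float = 0.20,           # the 20% increase mentioned above
    seed: int = 42,
) -> list:
    """Return one simulated annual bill per trial."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    totals = []
    for _ in range(trials):
        overrun = rng.uniform(*overrun_range)
        # If a shock happens, it lands in a random month and applies from then on.
        shock_month = rng.randrange(12) if rng.random() < shock_probability else None
        total = 0.0
        for month in range(12):
            shocked = shock_month is not None and month >= shock_month
            price_factor = (1 + shock_size) if shocked else 1.0
            total += monthly_estimate * (1 + overrun) * price_factor
        totals.append(total)
    return totals


totals = sorted(simulate_annual_bill(monthly_estimate=10_000))
print(f"budgeted:        {12 * 10_000:>12,.0f}")
print(f"median outcome:  {statistics.median(totals):>12,.0f}")
print(f"95th percentile: {totals[int(0.95 * len(totals))]:>12,.0f}")
```

The gap between the budgeted line and the 95th percentile is the number to take to your finance team.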

Amazon AWS Analytics Story Unfolds

Amazon AWS is known for stellar cloud infrastructure and databases including Data Pipeline, EMR, Redshift, Aurora, RDS, DynamoDB, and ElastiCache. This year we finally see Amazon getting serious about analytics with Amazon QuickSight for self-service BI, Glue for ETL, Kinesis for streaming data, Machine Learning, several new Artificial Intelligence offerings, and IoT services.

AWS Offerings

AWS Services 12/6/2016

My personal favorite analytics announcement was the new Amazon Athena query service for analyzing raw data sources stored in Amazon S3, described in the official blog. Now you can quickly query your data without having to set up and manage any servers or data warehouses. Just point to your data in Amazon S3, define the schema, and start querying using the built-in query editor.

Amazon Athena eliminates the need for complex extract, transform, and load (ETL) processes.

Amazon Athena uses Presto with full standard SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, and Parquet. And, while Amazon Athena is ideal for quick, ad-hoc querying and integrates with Amazon QuickSight for easy visualization, it can handle complex analysis, including large joins, window functions, and arrays. Since Amazon Athena executes queries using compute resources in multiple Availability Zones and uses Amazon S3 as the underlying data store, it is highly available and durable with data redundantly stored across multiple facilities and multiple devices in each facility. COOL! I can’t wait to test this one.
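If you would rather drive Athena from code than from the built-in query editor, the "point, define the schema, query" flow might look something like the sketch below. It assumes a client that behaves like the boto3 Athena client (`start_query_execution`). The bucket, database, table, and column names are all hypothetical placeholders of mine.

```python
# Hypothetical names throughout: bucket, database, table, and columns are
# placeholders. `client` is expected to behave like boto3.client("athena").

# Step 1: define a schema over files that already sit in S3.
CREATE_TABLE = """
CREATE EXTERNAL TABLE IF NOT EXISTS weblogs.requests (
  request_time STRING,
  url STRING,
  status INT,
  bytes_sent BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://example-bucket/logs/'
"""

# Step 2: query it with ordinary SQL.
TOP_URLS = """
SELECT url, COUNT(*) AS hits
FROM weblogs.requests
WHERE status = 200
GROUP BY url
ORDER BY hits DESC
LIMIT 10
"""


def build_request(sql, database, output_location):
    """Assemble the keyword arguments for start_query_execution."""
    return {
        "QueryString": sql.strip(),
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def submit(client, sql, database="weblogs",
           output_location="s3://example-bucket/athena-results/"):
    """Submit a query and return its execution ID; results land in S3."""
    response = client.start_query_execution(
        **build_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   client = boto3.client("athena")
#   submit(client, CREATE_TABLE)
#   query_id = submit(client, TOP_URLS)
```

Passing the client in as an argument keeps the sketch easy to try against a stub before pointing it at a real account.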

Amazon Glue ETL

Another announcement that I signed up to check out is the new AWS Glue for ETL. AWS Glue simplifies and automates data discovery, conversion, mapping, and job scheduling tasks. It is designed to guide you through the process of moving and understanding data, preparing data for analytics, and loading that data into Amazon destinations.

AWS Glue is integrated with Amazon S3, Amazon RDS, and Amazon Redshift, and can connect to any JDBC-compliant data store. AWS Glue automatically crawls your data sources, identifies data formats, and then suggests schemas and transformations, so you don’t have to hand-code data flows. You can then edit transformations, if necessary, using the tools and technologies you already know, such as Python, Spark, Git, and your favorite integrated development environment (IDE), and share them with other users. AWS Glue schedules your ETL jobs and provisions and scales all the infrastructure required so your ETL jobs run quickly and efficiently at any scale.
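To make the "crawl, then suggest a schema" idea concrete, here is a toy sketch in plain Python. To be clear, this is not how AWS Glue is implemented and it uses no Glue API. It only illustrates the general technique of sampling rows and picking the narrowest type that fits each column. The sample data and the type names are my own assumptions.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest type that fits every non-empty value."""
    for caster, name in ((int, "bigint"), (float, "double")):
        try:
            for value in values:
                if value != "":
                    caster(value)
            return name
        except ValueError:
            continue  # at least one value did not fit; try the next wider type
    return "string"


def suggest_schema(csv_text, sample_rows=100):
    """Read up to sample_rows rows and suggest a type for each column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [row for _, row in zip(range(sample_rows), reader)]
    return {col: infer_type([r[col] for r in rows]) for col in reader.fieldnames}


# Made-up sample data for illustration.
sample = """order_id,amount,country
1001,19.99,US
1002,5,DE
1003,,FR
"""

print(suggest_schema(sample))
# {'order_id': 'bigint', 'amount': 'double', 'country': 'string'}
```

A real crawler also has to cope with nested formats, dates, and inconsistent files, which is exactly the tedium a managed service promises to take off your plate.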

AWS Step Functions, a visual workflow service, also looks intriguing for coordinating the components of distributed applications and microservices using visual workflows rather than hand-written orchestration code.
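Behind the visual workflow, a Step Functions state machine is described in a JSON document written in the Amazon States Language. Here is a minimal sketch of a three-step workflow built as a Python dictionary. The Lambda ARNs, account ID, and step names are hypothetical placeholders, and the little link checker is my own helper, not part of any AWS SDK.

```python
import json

# Hypothetical Lambda ARNs: the account ID and function names are placeholders.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"

state_machine = {
    "Comment": "Sketch: extract, then transform, then load",
    "StartAt": "Extract",
    "States": {
        "Extract": {"Type": "Task", "Resource": ARN_PREFIX + "extract", "Next": "Transform"},
        "Transform": {"Type": "Task", "Resource": ARN_PREFIX + "transform", "Next": "Load"},
        "Load": {"Type": "Task", "Resource": ARN_PREFIX + "load", "End": True},
    },
}


def dangling_transitions(definition):
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    targets += [state["Next"] for state in states.values() if "Next" in state]
    return [target for target in targets if target not in states]


# An empty list means every transition points at a state that exists.
print(dangling_transitions(state_machine))
print(json.dumps(state_machine, indent=2))
```

The JSON that this prints is the kind of definition you would hand to the service, which then draws the workflow for you.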

Workflow

In the Artificial Intelligence area, AWS announced a family of services that provide cloud-native machine learning and deep learning technologies. Natural language understanding (NLU), automatic speech recognition (ASR), visual search and image recognition, text-to-speech (TTS), and machine learning (ML) technologies were unveiled.

  • Amazon Lex makes it easy to build sophisticated chatbots using the same technology that powers Alexa
  • Amazon Rekognition provides deep learning-based image recognition
  • Amazon Polly turns text into lifelike speech
  • Amazon Machine Learning allows you to quickly build smart ML applications
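For a feel of what calling these services from code might look like, here is a small sketch for Polly and Rekognition. It assumes clients that behave like the boto3 `polly` and `rekognition` clients. The voice, bucket, and file names are placeholders I picked for illustration.

```python
# `polly_client` and `rekognition_client` are expected to behave like
# boto3.client("polly") and boto3.client("rekognition").
# The voice, bucket, and key names below are illustrative placeholders.


def speak(polly_client, text, voice_id="Joanna"):
    """Turn text into MP3 audio bytes with Amazon Polly."""
    response = polly_client.synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice_id
    )
    return response["AudioStream"].read()


def label_image(rekognition_client, bucket, key, min_confidence=80.0):
    """Return (label, confidence) pairs for an image stored in S3."""
    response = rekognition_client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   audio = speak(boto3.client("polly"), "Hello from re:Invent")
#   with open("hello.mp3", "wb") as f:
#       f.write(audio)
#   print(label_image(boto3.client("rekognition"), "example-bucket", "photo.jpg"))
```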

During the keynotes, the announcement that seemed to be the biggest crowd-pleaser was the addition of PostgreSQL compatibility for Amazon Aurora. Several posted event presentations are also worth reviewing.

For More Information

That concludes my quick event summary of Amazon AWS re:Invent analytics highlights. I do confess that I have barely scratched the surface of what was announced last week. I’d love to share more as soon as I can catch up from the seasonal conference crush of news.

AWS re:Invent

Here are the specific event news links.

If you are interested in following Amazon AWS big data, analytics, data science, IoT and other offerings, check out the following resources.