Querying S3 with Presto This post assumes you have an AWS account and a Presto instance (standalone or cluster) running. We’ll use the Presto CLI to run the queries against the Yelp dataset. The dataset is a JSON dump of a subset of Yelp’s data for businesses, reviews, checkins, users and tips.
Configure Hive metastore Configure the Hive metastore to point at our data in S3. We are using the docker container inmobi/docker-hive