Continuous Compliance Solution Using Amazon Athena - Part II

In this blog, we will look at the continuous compliance solution by querying the AWS CloudTrail with Amazon Athena.

In part 1 of the continuous compliance solutions blog, we have learnt that Continuous compliance is the continuous monitoring of IT assets to ensure compliance with regulatory security benchmarks. It aims to shift IT teams away from reactively responding to audit requests and attacks and toward being proactive in their preparation for future threats and data reporting requirements. We also looked at the various AWS native compliance tools and a use case scenario that made use of these tools. In part 2 of this blog, we will learn how to use Amazon Athena for continuous compliance.

Querying the AWS CloudTrail using Amazon Athena

AWS CloudTrail

AWS CloudTrail is an AWS service that helps you enable governance, compliance, and operational and risk auditing of your AWS account. Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail. Events include actions taken in the AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs.

Amazon Athena

Amazon Athena is an interactive query service that makes it easy to analyse data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL. Athena is serverless, so there is no infrastructure to setup or manage, and you can start analysing data immediately. Amazon Athena is ideal for quick, ad-hoc querying, it can also handle complex analysis, including large joins, window functions, and arrays.

How to Query CloudTrail Logs using Amzon Athena

  1. CloudTrail service is enabled, and the logs are stored in S3 Bucket
  2. Initially, Athena Table is created for querying the CloudTrail Logs from S3
  3. The below external table creates a query containing partition projection enabled, so if any additional data is added in the mentioned S3 path the data will automatically be pulled by the query.

CREATE EXTERNAL TABLE cloudtrail_logs (

    • eventversion STRING,
    • useridentity STRUCT < type: STRING,
    • principalid: STRING,
    • arn: STRING,
    • accountid: STRING,
    • invokedby: STRING,
    • accesskeyid: STRING,
    • userName: STRING,
    • sessioncontext: STRUCT < attributes: STRUCT < mfaauthenticated: STRING,
    • creationdate: STRING >,
    • sessionissuer: STRUCT < type: STRING,
    • principalId: STRING,
    • arn: STRING,
    • accountId: STRING,
    • userName: STRING >>>,
    • eventtime STRING,
    • eventsource STRING,
    • eventname STRING,
    • awsregion STRING,
    • sourceipaddress STRING,
    • useragent STRING,
    • errorcode STRING,
    • errormessage STRING,
    • requestparameters STRING,
    • responseelements STRING,
    • additionaleventdata STRING,
    • requestid STRING,
    • eventid STRING,
    • resources ARRAY < STRUCT < ARN: STRING,
    • accountId: STRING,
    • type: STRING >>,
    • eventtype STRING,
    • apiversion STRING,
    • readonly STRING,
    • recipientaccountid STRING,
    • serviceeventdetails STRING,
    • sharedeventid STRING
    • ) PARTITIONED BY (accountid string, day string, region string) ROW FORMAT SERDE
    • 'com.amazon.emr.hive.serde.CloudTrailSerde' STORED AS INPUTFORMAT
    • 'com.amazon.emr.cloudtrail.CloudTrailInputFormat' OUTPUTFORMAT
    • 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://cloudtrails-
    • logsevents/AWSLogs/' TBLPROPERTIES (
    • 'projection.enabled' = 'true',
    • 'projection.accountid.range' = '0,9999999999999',
    • 'projection.accountid.type' = 'integer',
    • 'projection.day.format' = 'yyyy/MM/dd',
    • 'projection.day.interval' = '1',
    • 'projection.day.interval.unit' = 'DAYS',
    • 'projection.day.range' = '2015/01/01,NOW',
    • 'projection.day.type' = 'date',
    • 'projection.region.type' = 'enum',
    • 'projection.region.values' = 'us-east-1, us-east-2',
    • 'storage.location.template' =
    • 's3:///AWSLogs/${accountid}/CloudTrail/${region}/${day}/'
    • )

Creation of Athena view from the external table to parse the data from the column user-identity

  • create or replace view cloudtrail_logs_vw as
  • select
  • eventversion,
  • useridentity.type as useridentity_type,
  • eventname,
  • useridentity.principalid as useridentity_principalid,
  • useridentity.arn as useridentity_arn,
  • useridentity.accountid as useridentity_accountid,
  • useridentity.invokedby as useridentity_invokedby,
  • useridentity.username as useridentity_username,
  • useridentity.sessioncontext as useridentity_sessioncontext,
  • eventtime,
  • eventsource,
  • awsregion,
  • sourceipaddress,
  • useragent,
  • errorcode,
  • errormessage,
  • requestparameters,
  • responseelements,
  • additionaleventdata,
  • requestid,
  • eventid,
  • resources,
  • eventtype,
  • apiversion,
  • readonly,
  • recipientaccountid,
  • serviceeventdetails,
  • sharedeventid,
  • accountid,
  • day,
  • region
  • from cloudtrail_logs

For the querying particular Event Source and Event Name, the Below Query is run and the desired result will be shown as mentioned in the below image

  • select *
  • from cloudtrail_logs_vw
  • where accountid in ('')
  • and region = 'us-east-1'
  • and day = '2022/05/04'
  • and eventsource = 's3.amazonaws.com'
  • and eventname = 'PutBucketPublicAccessBlock'

Output

Picture1
Picture2

If you have any questions or suggestions, please reach out to us at contactus@1cloudhub.com

Written by:  Sriram Narayanan and Umashankar N

Updated on June 17, 2022

Sharing is caring!

In Blog
Subscribe to our Newsletter1CloudHub