In part 1 of the continuous compliance solutions blog, we have learnt that Continuous compliance is the continuous monitoring of IT assets to ensure compliance with regulatory security benchmarks. It aims to shift IT teams away from reactively responding to audit requests and attacks and toward being proactive in their preparation for future threats and data reporting requirements. We also looked at the various AWS native compliance tools and a use case scenario that made use of these tools. In part 2 of this blog, we will learn how to use Amazon Athena for continuous compliance.
Querying the AWS CloudTrail using Amazon Athena
AWS CloudTrail
AWS CloudTrail is an AWS service that helps you enable governance, compliance, and operational and risk auditing of your AWS account. Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail. Events include actions taken in the AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs.
Amazon Athena
Amazon Athena is an interactive query service that makes it easy to analyse data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL. Athena is serverless, so there is no infrastructure to setup or manage, and you can start analysing data immediately. Amazon Athena is ideal for quick, ad-hoc querying, it can also handle complex analysis, including large joins, window functions, and arrays.
How to Query CloudTrail Logs using Amzon Athena
- CloudTrail service is enabled, and the logs are stored in S3 Bucket
- Initially, Athena Table is created for querying the CloudTrail Logs from S3
- The below external table creates a query containing partition projection enabled, so if any additional data is added in the mentioned S3 path the data will automatically be pulled by the query.
CREATE EXTERNAL TABLE cloudtrail_logs (
-
eventversion STRING,
useridentity STRUCT < type: STRING,
principalid: STRING,
arn: STRING,
accountid: STRING,
invokedby: STRING,
accesskeyid: STRING,
userName: STRING,
sessioncontext: STRUCT < attributes: STRUCT < mfaauthenticated: STRING,
creationdate: STRING >,
sessionissuer: STRUCT < type: STRING,
principalId: STRING,
arn: STRING,
accountId: STRING,
userName: STRING >>>,
eventtime STRING,
eventsource STRING,
eventname STRING,
awsregion STRING,
sourceipaddress STRING,
useragent STRING,
errorcode STRING,
errormessage STRING,
requestparameters STRING,
responseelements STRING,
additionaleventdata STRING,
requestid STRING,
eventid STRING,
resources ARRAY < STRUCT < ARN: STRING,
accountId: STRING,
type: STRING >>,
eventtype STRING,
apiversion STRING,
readonly STRING,
recipientaccountid STRING,
serviceeventdetails STRING,
sharedeventid STRING
) PARTITIONED BY (accountid string, day string, region string) ROW FORMAT SERDE
'com.amazon.emr.hive.serde.CloudTrailSerde' STORED AS INPUTFORMAT
'com.amazon.emr.cloudtrail.CloudTrailInputFormat' OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://cloudtrails-
logsevents/AWSLogs/' TBLPROPERTIES (
'projection.enabled' = 'true',
'projection.accountid.range' = '0,9999999999999',
'projection.accountid.type' = 'integer',
'projection.day.format' = 'yyyy/MM/dd',
'projection.day.interval' = '1',
'projection.day.interval.unit' = 'DAYS',
'projection.day.range' = '2015/01/01,NOW',
'projection.day.type' = 'date',
'projection.region.type' = 'enum',
'projection.region.values' = 'us-east-1, us-east-2',
'storage.location.template' =
's3:///AWSLogs/${accountid}/CloudTrail/${region}/${day}/'
)
Creation of Athena view from the external table to parse the data from the column user-identity
create or replace view cloudtrail_logs_vw as
select
eventversion,
useridentity.type as useridentity_type,
eventname,
useridentity.principalid as useridentity_principalid,
useridentity.arn as useridentity_arn,
useridentity.accountid as useridentity_accountid,
useridentity.invokedby as useridentity_invokedby,
useridentity.username as useridentity_username,
useridentity.sessioncontext as useridentity_sessioncontext,
eventtime,
eventsource,
awsregion,
sourceipaddress,
useragent,
errorcode,
errormessage,
requestparameters,
responseelements,
additionaleventdata,
requestid,
eventid,
resources,
eventtype,
apiversion,
readonly,
recipientaccountid,
serviceeventdetails,
sharedeventid,
accountid,
day,
region
from cloudtrail_logs
For the querying particular Event Source and Event Name, the Below Query is run and the desired result will be shown as mentioned in the below image
select *
from cloudtrail_logs_vw
where accountid in ('')
and region = 'us-east-1'
and day = '2022/05/04'
and eventsource = 's3.amazonaws.com'
and eventname = 'PutBucketPublicAccessBlock'
Output
If you have any questions or suggestions, please reach out to us at contactus@1cloudhub.com
Written by: Sriram Narayanan and Umashankar N