RANGER-4217: RANGER-1837 (enabling ORC audit logs) does not work
What changes were proposed in this pull request?
This change fixes the functionality for storing audit logs in ORC format that was proposed in RANGER-1837 (https://reviews.apache.org/r/63552/diff/7/).
How was this patch tested?
The patch was tested for the hdfs plugin.
The following steps were carried out:
- On the NameNode host, created the spool directory and changed its owner so that the owner of the service can read, write, and execute it:
    mkdir -p /var/log/hdfs/audit/staging/spool
    cd /var/log/hdfs/audit/staging
    chown hdfs:hadoop spool
- Enabled AuditFileQueue via the following parameters in ranger-hdfs-audit.xml:
    xasecure.audit.destination.hdfs.batch.queuetype=filequeue
    xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300
    xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hdfs/audit/staging/spool
    xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size=10000
- Enabled the ORC file format for Ranger HDFS audit in ranger-hdfs-audit.xml:
    xasecure.audit.destination.hdfs.filetype=orc
- Provision to control the compression technique for the ORC format in ranger-hdfs-audit.xml:
    xasecure.audit.destination.hdfs.orc.compression=snappy
- Buffer size and stripe size of the ORC file batch. Defaults are '10000' bytes and '100000' bytes respectively:
    xasecure.audit.destination.hdfs.orc.buffersize=10000
    xasecure.audit.destination.hdfs.orc.stripesize=100000
- Added the ORC jars to the plugin path. The plugins are missing the orc-core, orc-shims, and aircompressor dependencies, so symlinks were added manually to the plugin classpath (work is in progress to add these dependencies to the distro instead of adding them manually to ranger-hdfs-plugin-impl):
    cd path/hadoop/lib/ranger-hdfs-plugin-impl
    ln -s jar_location/jars/orc-core-1.7.6.jar .
    ln -s jar_location/jars/orc-shims-1.7.6.jar .
    ln -s jar_location/jars/aircompressor-0.10.jar .
- Restarted HDFS using the stale config restart.
- Verified by creating a Hive table from the ORC data:
    CREATE EXTERNAL TABLE ranger_audit_event_new(
      repoType int,
      repo string,
      reqUser string,
      evtTime string,
      access string,
      resource string,
      resType string,
      action string,
      result int,
      agent string,
      policy int,
      reason string,
      enforcer string,
      cliIP string,
      agentHost string,
      logType string,
      id string,
      seq_num int,
      event_count int,
      event_dur_ms int,
      tags string,
      additional_info string,
      cluster_name string
    )
    STORED AS ORC
    LOCATION '/ranger/audit/hdfs/hdfs/20230414'
    TBLPROPERTIES ("orc.compress"="SNAPPY");
- The select query displays the audit log data stored in ORC format correctly:
    select * from ranger_audit_event_new;
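The steps above list the audit settings as key=value pairs, while ranger-hdfs-audit.xml stores them as Hadoop-style property elements. The sketch below shows one possible way to render the listed settings in that form for review. The helper names `props_to_xml` and `orc_audit_props` are hypothetical and not part of Ranger, and the output is meant to be merged into the existing file by hand rather than to replace it.

```shell
#!/bin/sh
# Sketch only: props_to_xml and orc_audit_props are hypothetical helpers,
# not part of Ranger. Values are not XML-escaped, which is acceptable for
# the plain values used in this PR.

# Read key=value lines on stdin and print Hadoop-style <property> elements.
props_to_xml() {
  echo "<configuration>"
  while IFS= read -r line || [ -n "$line" ]; do
    # Ignore blank lines and anything that is not key=value.
    case "$line" in
      *=*) ;;
      *) continue ;;
    esac
    name=${line%%=*}   # text before the first '='
    value=${line#*=}   # text after the first '='
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' \
      "$name" "$value"
  done
  echo "</configuration>"
}

# The settings used while testing this patch, copied from the steps above.
orc_audit_props() {
  cat <<'EOF'
xasecure.audit.destination.hdfs.batch.queuetype=filequeue
xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300
xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hdfs/audit/staging/spool
xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size=10000
xasecure.audit.destination.hdfs.filetype=orc
xasecure.audit.destination.hdfs.orc.compression=snappy
xasecure.audit.destination.hdfs.orc.buffersize=10000
xasecure.audit.destination.hdfs.orc.stripesize=100000
EOF
}

# Print the XML fragment for review.
orc_audit_props | props_to_xml
```

Keeping the settings as a plain key=value list makes it easy to compare them against the steps above before copying the generated property elements into the plugin configuration.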
Can you close this, as it has already been handled as part of another commit?