Getting RDS Logs for MySQL into Honeycomb

Amazon’s Relational Database Service (RDS) lets you use a number of databases without having to administer them yourself. The Honeycomb RDS connector gives you access to the same data as if you were running MySQL on your own server.

The Honeycomb RDS connector surfaces attributes like:

Honeycomb allows you to calculate metrics and statistics on the fly while retaining the full-resolution log lines (and the original MySQL query that started it all).

Once you’ve got data flowing, be sure to take a look at our starter queries. These entry points show our recommendations for comparing lock retention by normalized query, scan efficiency by collection, or read vs. write distribution by host.

Note: Run the following commands from any Linux host with the appropriate AWS credentials to access the RDS API.

Before you run the RDS connector

Before running the RDS connector, configure MySQL running on RDS to output the slow query log to a file. Refer to Amazon’s documentation on setting Parameter Groups to get started, and find more detail about the configuration options below in the MySQL docs for the slow query log.

Set the following options in the Parameter Group:

If you switch to a new Parameter Group when you make these changes, make sure you restart the database.
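If you prefer the command line to the console, the parameter changes can be sketched with the AWS CLI. This is a minimal sketch rather than a documented procedure for the connector: the parameter group name my-mysql-params is hypothetical, and the two settings shown (slow_query_log=1 and log_output=FILE) are assumptions inferred from the requirement that the slow query log be written to a file.

```shell
#!/usr/bin/env bash
# Sketch: enable the slow query log in an RDS parameter group.
# PARAM_GROUP is a hypothetical name; replace it with your own group.
PARAM_GROUP="${PARAM_GROUP:-my-mysql-params}"

# Format one entry for the --parameters argument of
# `aws rds modify-db-parameter-group`.
param() {
  printf 'ParameterName=%s,ParameterValue=%s,ApplyMethod=immediate' "$1" "$2"
}

# Apply the assumed settings: turn the slow query log on and write it to a file.
enable_slow_query_log() {
  aws rds modify-db-parameter-group \
    --db-parameter-group-name "$PARAM_GROUP" \
    --parameters \
      "$(param slow_query_log 1)" \
      "$(param log_output FILE)"
}

# Nothing is changed unless you opt in, e.g. `APPLY=1 bash enable_slow_log.sh`.
if [ "${APPLY:-0}" = "1" ]; then
  enable_slow_query_log
fi
```

The APPLY guard is only there so the script can be read or sourced without touching your database; drop it if you would rather call enable_slow_query_log directly.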

Once you’ve made these changes, verify in the RDS Console that you are getting RDS logs.
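The same check can also be made from the command line. The sketch below lists the instance's log files with the AWS CLI and filters for names containing "slowquery"; the instance identifier my-rds-instance is hypothetical, and the "slowquery" filter assumes the usual RDS MySQL naming of slow query log files.

```shell
#!/usr/bin/env bash
# Sketch: list slow query log files for an RDS instance.
# INSTANCE_ID is a hypothetical identifier; replace it with your own.
INSTANCE_ID="${INSTANCE_ID:-my-rds-instance}"

# Print the name and size (in bytes) of each matching log file.
list_slow_logs() {
  aws rds describe-db-log-files \
    --db-instance-identifier "$INSTANCE_ID" \
    --filename-contains slowquery \
    --query 'DescribeDBLogFiles[].[LogFileName,Size]' \
    --output text
}

# Run the check only when asked, e.g. `CHECK=1 bash list_slow_logs.sh`.
if [ "${CHECK:-0}" = "1" ]; then
  list_slow_logs
fi
```

If the list is empty, or every file has a size of zero, the slow query log is probably not yet being written.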

Download the RDS connector (rdslogs)

rdslogs will stream the MySQL slow query log from RDS or download older log files. It can stream them to STDOUT or directly to Honeycomb. You can view the rdslogs source here.

Get and verify the current Linux version of rdslogs:

wget -q && \
      echo '4fed64ba4daa0e11b8e80ad4ad4133fd349f8632bd4cf894067a0e4dbfdf5eef  rdslogs_1.77_amd64.deb' | sha256sum -c && \
      sudo dpkg -i rdslogs_1.77_amd64.deb

Stream current logs to Honeycomb

Use the rdslogs command with the --output flag set to honeycomb to connect to RDS and send data from the current log to Honeycomb.

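You will need your RDS instance identifier, the AWS region code for that instance, your Honeycomb write key, and the name of the dataset to send to: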

rdslogs \
    -i <instance-identifier> \
    --region=<region-code> \
    --output=honeycomb \
    --writekey=WRITEKEY \
    --dataset='RDS MySQL'

Use --sample_rate to send only a subset of your data: 1 in every N log lines, where N defaults to 1 (send everything). Sampling in Honeycomb is described in detail in Sampling high volume data.
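As an illustration of the 1/N arithmetic, the sketch below streams with a sample rate of 20, so roughly one log line in twenty is sent. The value 20 and the INSTANCE_ID, REGION, and WRITEKEY environment variables are assumptions made for this example; the flags themselves are the ones used in the streaming command above.

```shell
#!/usr/bin/env bash
# Sketch: stream to Honeycomb while keeping about 1 in every SAMPLE_RATE log lines.
SAMPLE_RATE="${SAMPLE_RATE:-20}"   # example value, not a recommendation

# Rough number of events sent for a given number of log lines.
# expected_events 100000 prints 5000 when SAMPLE_RATE is 20.
expected_events() {
  echo $(( $1 / SAMPLE_RATE ))
}

# INSTANCE_ID, REGION, and WRITEKEY are expected in the environment.
stream_sampled() {
  rdslogs \
    -i "${INSTANCE_ID:?set INSTANCE_ID}" \
    --region="${REGION:?set REGION}" \
    --output=honeycomb \
    --writekey="${WRITEKEY:?set WRITEKEY}" \
    --dataset='RDS MySQL' \
    --sample_rate="$SAMPLE_RATE"
}

# Start streaming only when asked, e.g. `STREAM=1 bash stream_sampled.sh`.
if [ "${STREAM:-0}" = "1" ]; then
  stream_sampled
fi
```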

Scrub personally identifiable information

We believe strongly in the value of being able to track down the precise query causing a problem. We also understand the concerns about exporting log data that may contain sensitive user information, so you have the option of hashing the concrete query before it is sent.

To hash the concrete query, add the flag --scrub_query. The normalized_query attribute will still be representative of the shape of the query, and identifying patterns (including specific queries) will still be possible, but the sensitive information will be completely obscured before leaving your servers.
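To see why a hashed query remains useful, the sketch below hashes a few query strings locally with sha256sum. It only illustrates the idea: the choice of SHA-256 and the sample queries are assumptions, and the hash that rdslogs itself applies is not specified here.

```shell
#!/usr/bin/env bash
# Sketch: a hash hides the literal values in a query but stays stable,
# so identical queries still group together.

# Print the hex SHA-256 digest of a query string.
hash_query() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

q1="SELECT * FROM users WHERE email = 'a@example.com'"
q2="SELECT * FROM users WHERE email = 'b@example.com'"

hash_query "$q1"   # same digest every time this exact query appears
hash_query "$q1"
hash_query "$q2"   # a different literal value yields a different digest
```

The digests reveal nothing readable about the email addresses, yet repeated occurrences of the same concrete query can still be counted and compared.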

For more information about dropping or scrubbing sensitive fields, see “Dropping or scrubbing fields” in the Agent documentation section.

Backfill existing logs

If you’re getting started with Honeycomb, you can load the past 24 hours of logs into Honeycomb to start finding interesting things right away. Launch this command to run in the background (it will take some time) while you hook up the live stream. (However, if you just now enabled the slow query log, you won’t have the past 24 hours of logs. You can skip this step and go straight to streaming.)

The following command will download all available slow query logs to a newly created slow_logs directory and then start up honeytail to send the parsed events to Honeycomb. You’ll need your RDS instance identifier (from the instances page of the RDS Console) and your Honeycomb write key (from your Honeycomb account page).

mkdir slow_logs && \
    rdslogs \
    -i <instance-identifier> \
    --download --download_dir=slow_logs && \
    honeytail \
    --writekey=WRITEKEY \
    --dataset='RDS MySQL' \
    --parser=mysql \
    --file='slow_logs/*'

Once you’ve finished backfilling your old logs, we recommend transitioning to the default streaming behavior to stream current logs.