Amazon Timestream is a fully managed, purpose-built time series database that, according to AWS, makes it easy to store and analyze trillions of data points per day. Because it is a serverless platform that automatically scales capacity up or down to match demand, it works well for both a proof of concept and a full production environment. You can implement a functioning solution in much less time than with a self-managed standalone database platform, and without the additional maintenance overhead.
Timestream integrates with commonly used services for data collection, visualization, and machine learning. You can send data to Timestream using AWS IoT Core, Amazon Kinesis, and Amazon MSK; visualize it with Amazon QuickSight, Grafana, and other business intelligence tools through JDBC; and feed it to Amazon SageMaker for machine learning.
We have used Timestream in conjunction with AWS IoT Core to pull IoT device data from MQTT topics and place that data in a Timestream database. Several pieces need to be configured to successfully send data from your IoT Core MQTT topics to a Timestream database:
- The Timestream database itself
- A table associated with the Timestream database
- A message routing rule created in AWS IoT Core that transfers the MQTT data to the Timestream database
- An IAM role that grants table write permissions to the message routing rule
All of these steps can be done manually in the AWS console, which is fine for one-off environments or testing, but we prefer to automate as much customer infrastructure as possible using Terraform. This enables greater consistency and allows us to be much more efficient once we have established a block of reusable code.
The code examples below come from a module (timestream) that we use in conjunction with another module that automatically creates AWS IoT Thing objects and policies. Since this post is specific to Amazon Timestream, we will cover the code located in the timestream module.
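For context, a call to such a module might look like the sketch below; the source path and example values are hypothetical, while the dbname and topic variable names are taken from the code later in this post.

# Hypothetical invocation of the timestream module (path and values assumed)
module "timestream" {
  source = "./modules/timestream"

  dbname = "sensorfleet" # prefixes the database name ("sensorfleetDB") and IoT rule name
  topic  = "device01"    # MQTT topic segment used in the rule's SELECT statement
}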
The piece below configures the Timestream database and its associated table.
- Two Terraform resources are defined: aws_timestreamwrite_database and aws_timestreamwrite_table
- In the table, we decide whether to enable magnetic store writes. The magnetic store lets you ingest late-arriving data whose timestamp is earlier than the current time and may fall outside the memory store retention period.
- We also set how long data is retained: a period in days for the magnetic store and in hours for the memory store. The values listed below are for a PoC; in a production environment you would most likely want to increase them, keeping in mind the increased cloud resource cost of doing so.
# Create the Timestream database using a name declared in a separate variable
resource "aws_timestreamwrite_database" "db" {
  database_name = "${var.dbname}DB"
}

# Create the Timestream table with a default name of "main_table"
resource "aws_timestreamwrite_table" "main_table" {
  database_name = aws_timestreamwrite_database.db.database_name
  table_name    = "main_table"

  # Allow writes to the magnetic store for late-arriving data
  magnetic_store_write_properties {
    enable_magnetic_store_writes = true
  }

  # Short retention periods suitable for a PoC
  retention_properties {
    magnetic_store_retention_period_in_days = 2
    memory_store_retention_period_in_hours  = 8
  }
}
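The dbname and topic variables consumed by the module are declared elsewhere; a minimal sketch of those declarations (the descriptions are assumptions, only the names come from the code in this post) might be:

# Assumed variable declarations for the timestream module
variable "dbname" {
  description = "Prefix used to name the Timestream database and IoT rule"
  type        = string
}

variable "topic" {
  description = "MQTT topic segment referenced by the IoT rule's SELECT statement"
  type        = string
}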
An AWS IAM role and policy also need to be defined. This policy allows the IoT Core message routing rule to send messages to Timestream. We grant the role timestream:WriteRecords and timestream:DescribeEndpoints access:
- WriteRecords allows us to write to the table
- DescribeEndpoints returns a list of available endpoints to make Timestream API calls against
- We dynamically pull the current AWS region and account ID using data lookups, shown just below
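The policy's Resource ARN references two data sources that were not included in the snippet above; these are the standard Terraform lookups for the current region and account:

# Data lookups used to build the Timestream table ARN in the policy below
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}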
# Create Timestream records write policy
resource "aws_iam_policy" "db_policy" {
  name = "db-policy"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "timestream:WriteRecords",
        ]
        Effect   = "Allow"
        Resource = "arn:aws:timestream:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:database/${var.dbname}DB/table/main_table"
      },
      {
        Action = [
          "timestream:DescribeEndpoints",
        ]
        Effect   = "Allow"
        Resource = "*"
      }
    ]
  })
}
This last piece creates the IAM role that will contain the above policy and an aws_iam_role_policy_attachment that attaches the policy to the role.
# Create the IAM role for Timestream records write
resource "aws_iam_role" "topic_role" {
  name = "timestream_write"

  # Trust policy allowing AWS IoT to assume this role
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "iot.amazonaws.com"
      },
      "Effect": "Allow"
    }
  ]
}
EOF
}

# Attach the IAM policy to the Timestream role
resource "aws_iam_role_policy_attachment" "policy_attach" {
  role       = aws_iam_role.topic_role.name
  policy_arn = aws_iam_policy.db_policy.arn
}
Now that the Timestream database, table, and IAM role are defined, we need to create the IoT Core message routing rule that sends the MQTT topic data to Timestream.
- The name of the rule includes the variable used to name the Timestream database
- We write a SELECT statement to determine the data we want to pull from the MQTT topic. The example below selects everything from the topic, but you may want to use a more targeted query statement to cut down on query costs.
- The Timestream database and table properties are also defined in this rule. The IAM role detailed earlier is provided, along with table dimensions; dimensions represent the metadata attributes of a time series data point.
# Create MQTT topic message routing rule to Timestream table
resource "aws_iot_topic_rule" "datastreamrule" {
  name        = "${var.dbname}TopicToTimestream"
  description = "Routes MQTT topic data to AWS Timestream table"
  enabled     = true
  sql         = "SELECT * FROM '${var.dbname}/${var.topic}/temp'"
  sql_version = "2016-03-23"

  # Timestream database properties
  # IAM role that allows table writes is specified here
  timestream {
    database_name = "${var.dbname}DB"
    table_name    = "main_table"
    role_arn      = aws_iam_role.topic_role.arn

    # Static dimension value; a substitution template such as
    # "$${topic(2)}" could instead pull the device ID from the topic
    dimension {
      name  = "device"
      value = "devid"
    }
  }
}
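To illustrate how messages become records (hypothetical payload; this reflects our understanding of the Timestream rule action, which writes each top-level attribute of the payload as a separate measure):

# Publishing this JSON payload to "mydb/device01/temp":
#   {"temperature": 21.5, "humidity": 40}
#
# would yield rows in main_table along these lines:
#   device | measure_name | measure_value | time
#   devid  | temperature  | 21.5          | <ingestion time>
#   devid  | humidity     | 40            | <ingestion time>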
Once we run our Terraform code, we should have a fully functional Timestream database, table, and message routing rule! Further steps after this would include sending your Timestream table data to a visualization platform such as Grafana.
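Before wiring up a dashboard, you can confirm data is flowing with a quick query in the Timestream query editor; a minimal example, assuming var.dbname was set to "mydb":

-- Sanity-check query (hypothetical database name)
SELECT device, measure_name, measure_value::double, time
FROM "mydbDB"."main_table"
ORDER BY time DESC
LIMIT 10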