Connect your AWS Redshift Database to Databox

Your SQL query results right on your mobile, big screen or PC

Boris Sagadin on September 27, 2016 (last modified on May 30, 2017) • 6 minute read

To start off, here’s what Amazon is saying about Redshift:

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.

If you’d like to learn more about the tool itself, please read this. Amazon Redshift has its roots in a very popular open source database, PostgreSQL 8.0.2, and has a strong focus on scalability. A common setup is a clustered environment with a leader node. It follows an MPP (Massively Parallel Processing) architecture, which means that all operations are executed with as much parallelism as possible.

At the moment, MySQL, PostgreSQL, Microsoft Azure SQL and Amazon Redshift are supported out of the box on Databox. You have the data ready in your database; now it just needs to be visualized in an easy and concise manner so that everyone, even your boss, can use it.

Let’s get started!

What Will We Accomplish in This Tutorial?

First, we’ll set up a new AWS Redshift cluster from scratch, then we’ll connect it to Databox and confirm that the connection is working. Lastly, we will create a Datacard visualizing the data from the cluster. All this without a single line of code, except for the SQL query.

1. Prepare Your Redshift Cluster

In this section, we will create a new AWS Redshift cluster step by step, add a user and set up network rules to allow access from our IP (52.4.198.118).

Log in to the AWS Console, then visit Services / Redshift, click on ‘Launch Cluster’ and fill in the cluster details:

AWS Redshift Cluster Settings

Fill the form as needed; defaults are fine in this example. Pick a secure password.

Now choose the Node Type; for this example, we’ll use the smallest one: dc1.large and the Single Node Cluster Type.

AWS Redshift Node Configuration

On the next page, your settings will also depend on your network setup; most of the defaults are fine for this example.

Review your settings and click on ‘Launch Cluster.’ Your cluster will take some time to build. When it’s ready, click on ‘Cluster Name’ and the cluster overview will be shown. The hostname to connect to is visible in the Endpoint string.

In our example, the hostname is redshift1.cssy86qcwxay.eu-central-1.redshift.amazonaws.com.

Your server should now be successfully set up to accept requests from our IP (52.4.198.118) to your Amazon Redshift cluster database, using your chosen user name and password. Go ahead and load some sample data and it’s ready for connecting to Databox.
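
For readers who script their AWS setup, the network rule described above can be sketched in Python. This is only an illustration under assumptions, not the procedure used in this tutorial: the helper name `redshift_ingress_rule` and the group ID `sg-EXAMPLE` are hypothetical, and the dictionary keys mirror the keyword arguments that boto3's EC2 `authorize_security_group_ingress` call accepts for a VPC security group. The IP and port are the ones mentioned in this article.

```python
import ipaddress

DATABOX_IP = "52.4.198.118"   # Databox IP given in this tutorial
REDSHIFT_PORT = 5439          # default Redshift port


def redshift_ingress_rule(group_id, source_ip=DATABOX_IP, port=REDSHIFT_PORT):
    """Build the parameters for a single-host TCP ingress rule."""
    # ip_network() on a bare IPv4 address yields a /32 network (one host only).
    cidr = str(ipaddress.ip_network(source_ip))
    return {
        "GroupId": group_id,
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "CidrIp": cidr,
    }


rule = redshift_ingress_rule("sg-EXAMPLE")  # placeholder group ID
print(rule["CidrIp"])
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("ec2").authorize_security_group_ingress(**rule)`. Editing the security group's inbound rules in the AWS Console achieves the same result.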

2. Connect Your Database Cluster to Databox

The database cluster is now ready! The next step is to connect it and test that it’s returning the data we need for our visualizations:

  • Log in to the Databox web application and click on the ‘Data Manager’ tab,
  • Go to the Available data sources option and find the AWS Redshift tile. At the time of this writing, Redshift is still in beta. You will need to check the Beta checkbox on the bottom right to see it,
AWS Redshift connect
  • Hover over it with your mouse and click the ‘Connect’ button that slides up into view
  • Enter your connection data in the popup and click the ‘Activate’ button. Default port 5439 is fine in most cases.
AWS Redshift Connect Popups
  • If all went well, the popup will close shortly and you’ll get a “connected!” message.

Great! You have just successfully connected your database to Databox. In the next step, we’ll write a custom query that will regularly fetch data from your database and make it available for use in any Datacard.

Troubleshooting: If you get a “wrong credentials” message, double-check your user data. If you’re stuck on ‘Activate’ for a minute or so, Databox is probably having trouble connecting to your database host due to firewall, server or networking issues.
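
When chasing such networking issues, a quick TCP reachability check can help narrow things down. The following is a minimal sketch, assuming Python 3 is available; the function name `can_reach` is hypothetical. Note that it tests the path from your own machine, so a successful result does not prove that the Databox IP is allowed through your firewall.

```python
import socket


def can_reach(host, port=5439, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False
```

Call it with the Endpoint hostname of your cluster, for example `can_reach("<your-endpoint-hostname>")`. A result of `False` from a machine that should have access points to a security group, routing or DNS problem rather than to wrong credentials.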

3. Visualize Your Data with Databox Designer

Now that the database is connected, we will use the Designer to query, shape and display the data in a format that’s most appropriate and useful for our needs:

  • Choose an existing Datacard or create a new one (how?)
  • Choose the Datablocks icon on the left
  • Drag & drop the Table block onto your Datacard
  • For our example, where we will have a dynamic table (the pushed metric key has attributes/lines), we will switch to gather data from ‘Single metric’ in the properties panel on the right
  • Select your newly created My AWS Redshift data source from the Datasource dropdown on the right
  • Click on the Metric dropdown below Datasource and choose ‘Custom Metric from Query Builder’
  • Write your SQL query in the popup window that appears. For this example, we will write a basic query that returns the number of posts per author in our database, together with the date of each author’s latest post. Redshift requires every selected column to be either aggregated or listed in GROUP BY:
SELECT COUNT(p.ID) AS posts, u.display_name, MAX(p.date) AS date FROM dbwp_users u JOIN dbwp_posts p ON p.post_author = u.ID WHERE p.post_type = 'post' GROUP BY u.ID, u.display_name
  • Now click on ‘Show Data’ below. Your query result should now be displayed at the bottom; the exact rows depend on your data, of course.
  • (optional) You can rename each column (which will become a metric key in Databox) by clicking on the arrow beside it and typing in a new name.
  • (optional) You can enter a different metric key name pattern or just leave the asterisk (*), which will create a metric key with the same name as pushed. By default the output (the target data source, where the data gets pushed to) is already selected and is the same as your source data connection. You can use other tokens if needed.
  • Once you are satisfied with the data you see, just click ‘Save Query.’
  • Tada! After you have saved your custom query, you should see the data in the table. If not, check that the right data source and metric are selected. In our example it’s the ‘My AWS Redshift’ data source and the ‘└ posts|name’ metric, because we’re pushing posts by names. The time interval should be set to ‘Today’ to see the latest data.

We have just written a custom SQL query and displayed its results. Databox will fetch data from this resource every hour and store it in the selected target data source (in our example ‘My AWS Redshift’).
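
To get a feel for the shape of result a per-author count produces, here is a small sketch that runs such a query against made-up rows. It uses Python's built-in sqlite3 module as a stand-in for Redshift, which is an assumption made only so the example runs anywhere; the table and column names follow the example query, the sample authors and dates are invented, and `MAX(p.date)` is one possible way to pick a single date per author.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Redshift (assumption)
conn.executescript("""
CREATE TABLE dbwp_users (ID INTEGER, display_name TEXT);
CREATE TABLE dbwp_posts (ID INTEGER, post_author INTEGER, post_type TEXT, date TEXT);
INSERT INTO dbwp_users VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO dbwp_posts VALUES
  (10, 1, 'post', '2016-09-01'),
  (11, 1, 'post', '2016-09-20'),
  (12, 2, 'post', '2016-09-05'),
  (13, 2, 'page', '2016-09-07');
""")

# Every selected column is either aggregated or listed in GROUP BY.
query = """
SELECT COUNT(p.ID) AS posts, u.display_name, MAX(p.date) AS date
FROM dbwp_users u
JOIN dbwp_posts p ON p.post_author = u.ID
WHERE p.post_type = 'post'
GROUP BY u.ID, u.display_name
ORDER BY u.display_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (2, 'Ana', '2016-09-20')
# (1, 'Ben', '2016-09-05')
```

Each row carries one count, one author name and one date, which is the structure the table block displays per author.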

Writing Queries Basics

Each query must contain a column named date that holds a valid date. Let’s take the following SQL query as an example:

SELECT salary_date AS date, salary FROM employees

In the employees table we have a date column named salary_date. As Databox expects a column named date, we select our salary_date column as date.

Salary is another column, containing a number; its column name will be pushed as a metric key named salary. This query is valid and can be pushed to Databox.
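
The naming rule above can be checked before a query is pasted into the Query Builder. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module with made-up rows instead of a real Redshift cluster, and the helper `check_columns` is hypothetical. It inspects only the column names of a result set, not whether the date values are valid.

```python
import sqlite3

# In-memory SQLite stands in for Redshift here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (salary_date TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("2016-09-01", 3200.0), ("2016-09-15", 2950.0)],  # made-up rows
)


def check_columns(cursor):
    """Return (has_date, metric_keys) for an executed query's result set."""
    names = [col[0] for col in cursor.description]
    has_date = "date" in names
    metric_keys = [n for n in names if n != "date"]
    return has_date, metric_keys


good = conn.execute("SELECT salary_date AS date, salary FROM employees")
bad = conn.execute("SELECT salary_date, salary FROM employees")

print(check_columns(good))  # (True, ['salary'])
print(check_columns(bad))   # (False, ['salary_date', 'salary'])
```

The second query fails the check because the date column was not aliased to date.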

Troubleshooting: If you don’t see any data, double-check your SQL query and try it directly on your database. If it’s not displaying results there, you have an error somewhere in your query. Also check that the AWS Redshift user has the necessary permissions to access the database from the Databox IP.

Well done! Your AWS Redshift database is now connected to Databox, queries can be executed and then displayed on your mobile / big screen / computer.

AWS Redshift Mobile Dashboard

Go ahead and explore further. Add more queries, add blocks, explore different types of visualizations. Make that perfect Datacard (or Datawall of course) you always needed but didn’t know how to get. Now you can! Clean and professional, right at your fingertips. Only data that matters, without clutter. The possibilities are truly endless.

Ready to try it for yourself? Sign up for free today and let us know how it went for you.

Remember: we’re always glad to help if you run into any obstacles!

About the author
Boris Sagadin is Databox's DevOps engineer. He's passionate about everything servers, redundancy, monitoring and security. In his free time he enjoys running, reading and traveling.
