Thursday, December 1, 2022
HomeGoogleTired of searching Slack, GitHub, and Google Drive separately? Do All...

Tired of searching Slack, GitHub, and Google Drive separately? Do All This At Once In SQL

You know the drill: The words you’re looking for can be in Slack, or GitHub, or Google Drive, or Google Sheets, or Zendesk, or…the list goes on. Searching through these silos is a common frustration. It should be frictionless, and this Steam Hose dashboard makes it that way.

It wasn’t my first rodeo. I started this journey in 1996 and have revisited the idea periodically. In 2018 I wrote about a version that was the classic example of The Dopey Thing That Could Possible Work: a webpage that lists search URLs for various services and visits each in its own tab. As silly as it sounds, it was pretty useful to get used to a bit, and not just by me.

Of course I wanted to use the underlying APIs, normalize the results and merge them into a common view. But the effort required to rip out all the APIs made this project more difficult than it was worth. If you’ve done this sort of thing before, most services provide search APIs as well as adapters for your favorite programming language. But each service will have its own way of calling the API, paging the results, and formatting them. These differences create friction that you must overcome in order to work with results consistently.

Advertising

However, when API feuds become frictionless, a lot becomes possible. Effective metasearch is one of them. Steampipe allows you to exit the activity of calling APIs, paging results, and unpacking JSON objects. It calls APIs for you and streams the results to database tables so you can focus entirely on working with the data. This solves the biggest problem you face when building a metasearch dashboard.

Convergence on a schema

The next challenge is to tie the search results to a common schema. SQL is a great environment to do this. The query that drives the dashboard shown in the screencast includes three stanzas that you don’t need to be an SQL Wizard to write. They all follow the same pattern as this for finding GitHub issues.

select
   'github_issue' as type,
   repository_full_name || ' ' || title as source,
   to_char(created_at, 'YYYY-MM-DD') as date,
   html_url as link,
   substring(body from 1 for 200) || '...' as content
from
   github_search_issue
where
   $1 ~ 'github_issue'
   and query = 'in:body in:comments org:github ' || $2
   limit $3

The items in blue are the column names of a database table, in this case. github_search_issue, one of the tables made by Steampipe’s GitHub plugin. The Steampipe hub makes it easy to inspect table column names and descriptions, and shows you examples of how to use table information.

Because retrieving data doesn’t require API calls and JSON unpacking, you can focus on higher-order search syntax, which is a lot to think about, plus the interesting (and fun!) challenge of mapping the source columns to a common schema.

Items in red are the column names that appear in the dashboard. For this dashboard, we have decided that each search result will be mapped to these five columns: kind, source, date, linkand contents. SQL’s AS clause allows each stanza to easily rename its columns to match the schema.

The full query

Here is the full query that drives the dashboard. There are three stanzas like the one above, each written as a CTE (common table expression) with parameters corresponding to the input variables. And there is almost nothing else! Each stanza queries an API-based table (slack_search, github_search_issue, googleworkspace_drive_my_file), selects (and perhaps transforms) columns, then aliases the results to match the schema. All that’s left for UNION are the three CTEs, which act as temporary tables, and order the results.

with slack as (
  select
    'slack' as type,
    user_name || ' in #' || (channel ->> 'name')::text as source,
    to_char(timestamp, 'YYYY-MM-DD') as date,
    permalink as link,
    substring(text from 1 for 200) as content
  from
    slack_search
  where
    $1 ~ 'slack'
    and query = 'in:#steampipe after:${local.config.slack_date} ' || $2
  limit $3
),
github_issue as (
  select
    'github_issue' as type,
    repository_full_name || ' ' || title as source,
    to_char(created_at, 'YYYY-MM-DD') as date,
    html_url as link,
    substring(body from 1 for 200) || '...' as content
  from
    github_search_issue
  where
    $1 ~ 'github_issue'
    and query = ' in:body in:comments org:${local.config.github_org} ' || $2
  limit $3
),
gdrive as (
  select
    'gdrive' as type,
    replace(mime_type,'application/vnd.google-apps.','') as source,
    to_char(created_time, 'YYYY-MM-DD') as date,
    'https://docs.google.com/document/d/' || id as link,
    name as content
  from
    googleworkspace_drive_my_file
  where
    $1 ~ 'gdrive'
    and query = 'fullText contains ' || '''' || $2 || ''''
  limit $3
)

select * from slack
union 
select * from github_issue
union 
select * from gdrive

order by
  date desc

Dashboards as code

Many dashboard systems can work with this query. You could, for example, connect Metabase, tableau or another Postgres client to Steampipe and create the same kind of interactive dashboard as shown here. You would do this job in a low-code environment where widgets and settings are managed in a user interface. Vapor Pipes dashboard subsystem takes a different approach informed by its infrastructure as code (IaC) roots. API queries should be expressed in managed SQL code, like any other code, in version-controlled repositories. Dashboard widgets that display the results of these queries must also be expressed in code, and in this case, the language is that of Terraform. HCL.

Here is the HCL definition of the metasearch dashboard. It declares three types of hall to block: Sources (multiple selection), search_term (text), and max_per_source (single selection, which is the default). You can do a lot more with the hall block: in particular, you can fill it with the results of an SQL query, as shown in the documentation. It’s not necessary here, however.

the table block uses the query defined above and sets the parameters passed to it. the wrap guarantees that columns with a lot of text will be readable.

dashboard "metasearch" {

  input "sources" {
    title = "sources"
    type = "multiselect"
    width = 2
    option "slack" {} 
    option "github_issue" {}
    option "gdrive" {}
  }  

  input "search_term" {
    type = "text"
    width = 2
    title = "search term"
  }

  input "max_per_source" {
    title = "max per source"
    width = 2
    option "2" {}
    option "5" {}
    option "10" {}   
    option "20" {}
  }  

  table {
    title = "search slack + github + gdrive"
    query = query.metasearch
    args = [
      self.input.sources,
      self.input.search_term,
      self.input.max_per_source
    ]
    column "source" {
      wrap = "all"
    }
    column "link" {
      wrap = "all"
    }
    column "content" {
      wrap = "all"
    }
  }

}

Again, there’s not much else to see here, and there shouldn’t be. Building dashboards as code shouldn’t require a lot of complex code, and it doesn’t.

No magic required

Just as you don’t need to be an SQL wizard to create new subqueries, you also don’t need to be an HCL wizard to add them to the dashboard. Do you want to add sources? There are dozens more plugins to choose from, with more added every month. Not all of them offer search, but many do, and they’re easy to find with (of course!) a Steampipe query.

select
  name
  html_url
from
  github_search_code
where
  query = 'search org:turbot org:francois2metz org:ellisvalentiner org:theapsgroup'
  and name ~ 'table'
  and name ~ 'search'
order by
  name

In the Sample Steam Pipes repo, we included the dashboard code shown here, along with an additional search stanza for Zendesk that we removed when our trial account expired. Have fun expanding this dashboard! If a search API you need isn’t already available, drop by our Slack Community and let us know. Someone might already be writing the plugin you need, or maybe you’d like to tackle it yourself. Each new plugin empowers anyone who can work with basic HCL plus SQL to use APIs like a pro and solve real problems.

Copyright © 2022 IDG Communications, Inc.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Tags