10 Things I Hate About SQL

Ranging from the Petty to the Serious

Shaker Islam
8 min readFeb 1, 2021

Let’s clear the air quick— SQL is awesome. It’s the most consistently used query language for well over two decades. There’s a ton of love for SQL from thousands of analysts and developers around the world. And honestly, I probably wouldn’t even have a job if it weren’t for SQL simply based on how ubiquitous it is and how much it helps building products.

So before getting into my rant below, I want to say THANK YOU to the people who’ve worked on SQL and SQL systems. You all must have moved mountains to make it all happen, and I deeply appreciate it.

</appreciation>

<rant>

All that said… SQL can sometimes SUCK. Here’s a bunch of reasons why.

1. SQL Syntax is WAY too strict

Can you spot the errors in the following query?

cool_films as (
select
title,
"Description: " + description,
length,
release_year,
from film
where description like '%cool%'
where title like '%cool%'
),
select * from cool_films

How many did you find? Because there are five errors!

  1. There’s a missing with at the beginning for the common table expression
  2. Trailing comma after release_year,
  3. … and another trailing comma after the closing parens ),
  4. Double quotes don’t refer to strings, they refer to identifiers (usually column identifiers)
  5. You can’t use where twice; you have to use and if you want to specify more filtering conditions

I guarantee you most people who write SQL for work run into at least one of these errors once a week or more. These errors are incredibly frustrating and they pile onto what is already a high learning curve for SQL.

These syntax rules also don’t add much. Is the above SQL that difficult to read? Or even difficult to parse before executing? No! Let me run my query!

2. UPPERCASE IS JUST GREAT, IT’S SUPER CLEAR

IF YOU’RE LOOKING AT QUERY LIKE THIS:

SELECT
something
FROM a_table
WHERE im_not_yelled_at
ORDER BY my_sanity ASC

DO YOU REALLY FIND THIS MORE READABLE THAN:

select
something
from a_table
where im_not_yelled_at
order by my_sanity asc

WHO WANTS THEIR QUERIES TO YELL AT THEM?

HAVE I MADE MY POINT YET??

3. Aliases don’t work after the Select block

Suppose we have the following query:

select
title,
description,
(publish_year / 100) as century
from books
where century between 16 and 19

This will return an error because the alias century can’t be used in the where clause. Some SQL engines will allow you to use having here instead, but it’s not standard SQL.

The historical reason for this is that everything in the select block is supposed to happen last — meaning the SQL engine will process the line where century like between 16 and 19 first, and it’s going to be like, “huh, I haven’t seen this century before!”

I’m no expert on building SQL engines, but I don’t really understand why this has to be a rule. Is there a logic error or such a big performance hit that it’s impossible for SQL engines to look for aliases in the select clause before executing the where clause? Is doing a quick scan of some text truly that hard to do?

4. Select statements are not dynamic

When I’m iterating and in the middle of working on a query, I often want to do something like this:

withsomething_i_copy_pasted (
select
... 50 different columns
from a_big_table
),
-- probably many other CTEs with complicated joins...select
*, except column "foo", and format column "bar" better
from something_i_copy_pasted

In these cases, I end up having to copy/paste the select statements across multiple CTEs and modify them slightly. This ends up being a tedious task when I really want to change something small and see what the results look like.

5. Sharing queries & code is very difficult

The oft recommended way to share SQL query snippets is through Views. I’m sure some teams use Views correctly and don’t run into issues, but I haven’t seen one.

My main issue with Views is that they can’t take input variables. For instance, it’d be amazing to have a View where you can modify a specific where condition or change/remove a join, or be able to slot in a CTE somewhere. These types of adjustments are nearly impossible without some very complicated Stored Procedures, which are yet another common source of issues.

Unlike a programming language where I can often copy/paste things from Stack Overflow or use a package manager and utilize a library, in SQL I can do none of those things. There are no SQL “packages” that just do a cross join unnest(x), or do some gnarly window functions. Above all, there’s no built-in support for a collaborative codebase — most teams have to roll their own version of shared SQL snippets in Github. All this sucks!

6. No variables in ANSI SQL

Yes yes, variables exist in some SQL dialects, but they’re not standard.

Here’s an example where variables could be helpful:

select
title,
description,
(publish_year / 100) as century,
case
when century < 0 then (century * -1) || ' B.C.'
when century > 0 then century || ' A.D.'
as formatted_century
from books

Alas, aliases can’t be reused! This is another situation where I need CTE, or write some really ugly SQL. In my opinion, both are less than ideal.

7. Bad, cryptic errors

Every time I get an error from my SQL query, I want to put my head through a wall. Often times, the errors should be simple or easy to fix, but SQL error messages give little information on how.

Here’s a forced example:

select
first_name
from actor a
join actor b on a.actor_id = b.actor_id
#=> Query Error: error: column reference "first_name" is ambiguous

The error is clear on what’s wrong — there are two columns called first_name, and the engine doesn’t know which one you want. Makes sense! But would it be SO HARD to tell me how to fix this simple error?

What I really want is this:

select
first_name
from actor a
join actor b on a.actor_id = b.actor_id
#=> Query Error: Column reference "first_name" is ambiguous. Available columns are a.first_name, b.first_name.

Easy peasy! What about misspellings?

select
first_namee
from actor a
#=> Query Error: error: column "first_namee" does not exist-- nope! this is what i want:select
first_namee
from actor a
#=> Query Error: Column "first_namee" does not exist. Did you mean "first_name"?

Did someone say “syntax error”?

select
first_name
from actor a
order by last_name
where first_name = 'Brad'
#=> Query Error: error: syntax error at or near "where"-- "where" do i begin? no thanks! just tell me what the problem is:select
first_name
from actor a
order by last_name
where first_name = 'Brad'
#=> Query Error: Syntax error at or near "where". "where" should come before "order by"

I could go on and on.

There are many errors to sift through, and I’m sure some have ambiguous or unclear solutions, and I know some of you will shout “use an IDE!”, etc. etc.

(side note: most SQL IDE’s are terrible, but I digress…)

If SQL engines returned with helpful error messages in the first place, no one would need a workaround, or bug their coworkers, or furiously google for fixable issues. Better error messages would be the single biggest UX improvement of SQL that I can think of.

8. SQL is impossible to debug

In many programming languages, there’s built-in support for a dropping debug statements, stopping execution, and being able to poke around and see what your program is doing. This is immensely helpful whether you’re building something new or debugging a problem in your code.

In SQL, there’s no equivalent. Beginners usually don’t know what to do when they’re faced by query that “isn’t working” or — worse yet — a query that silently gives unexpected results and you didn’t know.

Over time, you build an intuition on how to manually “debug” a query, doing things like:

  • Starting with a limit 1 or where id = '123' to limit your results and iterate quickly
  • Doing a left join other on a = other.b where a = null to surface join issues
  • Running CTEs one by one so you know how the query is “building” from one CTE to the next

I’m sure there’s other methods of “debugging” out there. But these feel like workarounds to me. What I really want is a way to “step through” my query and see how the SQL engine is processing it, and what types of results it gets along the way.

Believe me, I get that building a debugger into a SQL engine would be ridiculously difficult. You’re usually not running your query directly on the database server — you’re sending your query over the network to be validated, parsed, and executed, often on huge datasets. You can’t quite “peek” into how a query is getting executed like you could in a programming language.

… unless there’s a genius out there who can figure it out! I don’t know what’s actually possible, but any progress here would be an immense help.

9. Locked into relational algebra in ANSI SQL

I have to give credit where credit’s due — relational algebra is great, and it’s really useful for a query language to describe how to slice and dice data.

Sometimes, though, I just want a for loop. Just let me drop into a lambda and modify some things here and there. In a similar vein to my other complaints, functions/lambdas do exist in some SQL dialects, but they’re not part of ANSI SQL.

As a result, when you’re locked into relational algebra, it can often feel like a square peg in a round hole. Funky joins, compounding window functions, complex sub-selects — all of which are valid solutions that can, at the same time, feel very “advanced” and sometimes even hacky, even for a seasoned SQL user.

In reality, I want SQL to be as easy as Excel, or as smooth as dropping into a REPL to dig around what’s possible. Users aren’t going to get their queries right from step one, and the nature of relational algebra is that you’re going to get your queries wrong A LOT before you get them right. It’s so demotivating to go through it, and in reality it doesn’t have to be this way.

10. SQL implementations are inconsistent

Of course, the most common complaint of all.

With PostgreSQL, MySQL, Spark SQL, Presto SQL, SQL Server, Oracle SQL, and probably many others… we just can’t get a break! Each of these SQL implementations have slight idiosyncrasies that make it very painful to transfer queries from one system to another.

This issue truly rears its head when you want to use Google or Stack Overflow for solutions. You can’t always search for “how do I do X in SQL”, because some of the answers may be in a specific SQL dialect and won’t help you. For someone new to SQL, this is brutal, and it can take a long time to resolve simple issues.

I don’t know what the root problem here is. There is indeed an accepted SQL Standard, but nonetheless there are many deviations from it. Perhaps if there was recognized implementation of the SQL Standard as a standalone SQL Parser library that any application can utilize, then the industry could converge on the standard naturally.

But alas, here we are.

My hope is that at least some of this resonated with you. Maybe if enough people feel strongly about what can improve, then SQL really will improve.

We spend so much time about the UX of web and mobile products, but not enough, IMO, on developer or analyst products. One day, though, one day.

Thanks for listening to my TED talk.

--

--

Shaker Islam

Sometimes serious. Mostly harmless. I write about tech and personal stuff. Being silly elsewhere on the interwebs @__smiz or on Clubhouse.