When you're given complex SQL queries to write which involve joins across many tables and several nested conditions, then it can seem a little intimidating, especially if you're not too confident or experienced with SQL syntax. This article shows you how to write a complex SQL query by breaking it down into simple steps and building it back up from there.
Know what results you want to achieve
Firstly, begin with the end in mind and write down what the expected resultset will be from your SQL query. Know how many records you expect to retrieve and which tables need to be used to create the results that you need.
Understand the table relationships
The next step in writing a complicated SQL query is to have a good understanding of the different fields in the tables and the relationships between each of the fields. Are they one to one relationships or one to many relationships? An ER (entity/relationship) diagram is useful to have here as it will quickly show these relationships for you. Make sure you know the fields that are required for your joins.
Break your query up into manageable parts
Remember that you don't have to write the whole SQL query at once. Start by determining which join to use on two of your tables and then introduce other tables from there. Remember that complex queries are usually just a collection of simple SQL queries. Test the simpler queries at each stage to make sure that you're getting the results that you need.
The diagram below (from Steve Stedman) is really useful in this stage of your complex SQL statement planning.
Once you've created your query, remember to retest it every time that you add new tables. This is a check to make sure you're still getting the results that you require.
Ask if you need to aggregate/order the result set
If you need to group the result set by one or more columns then you'll probably need to use GROUP BY. You may also need to use this with an aggregate function such as COUNT, MAX, MIN, SUM or AVG. Also, if the order of the results is important then don't forget your ORDER and ASC/DESC at the end of the statement.
Make it easier to read
Indent the SQL statement, where needed, and add comments too, where necessary. Don't put comments in where they're not needed as too many comments will make the code less readable and make the SQL statement longer than it needs to be. Remember that the longer the SQL statement, the more bandwidth is used and the more time the statement will take to execute (although this is usually negligible.)
The SQL will also be easier to understand if everyone uses the same convention for table names (e.g. first_name, FirstName, firstName) and capitalises the SQL keywords too. (e.g. SELECT first_name from MEMBERS ).
Another thing to watch out for is not using * (e.g. SELECT * from MEMBERS). The query will be much easier to understand if the field names are given and using * also slows down the query when * is not needed.
Other things to consider
You may find it easier (but not necessarily better) to use a stored procedure for you query if it is extremely complex however there are a number of points to consider before you do:
- they are not usually portable if you move to another DBMS.
- they are notoriously difficult to test as there is minimal debugging.
- they decrease the load on the application server but increase the load on the database server.
You could also use simpler SQL statements and then loop through the results in your application code calling more SQL statements as and when required. I wouldn't recommend this route as it is a hack and will result in messy code. Best to practise with SQL and become proficient at it than employ a workaround.
Lastly, you may find it easier to use a GUI to design complicated SQL queries. Two options are available here:
- LibreOffice Base. (Create query in design view).
- Office Access. (Create query design).
Remember that the code generated by a GUI may need some tweaking, but it may prove to be a good starting point.