Difference between revisions of "Writing MySQL queries"
(→Grouping and summarizing information: Added link to The_GROUPing_pitfall) |
(Updated to use <source> tag instead ot the MySQL template) |
||
Line 3: | Line 3: | ||
=== Case === | === Case === | ||
Query text that references MySQL elements (basic query syntax, functions, etc) should be in all uppercase letters, while references to elements of a database structure should be in lowercase. For example, instead of | Query text that references MySQL elements (basic query syntax, functions, etc) should be in all uppercase letters, while references to elements of a database structure should be in lowercase. For example, instead of | ||
<source lang="MySQL"> | |||
select PANTS from tableofstuff Where date(teh_date) = Now() | |||
</source> | |||
you would write | |||
<source lang="MySQL"> | |||
SELECT pants FROM tableofstuff WHERE DATE(teh_date) = NOW() | |||
</source> | |||
=== Backticks === | === Backticks === | ||
When writing production queries, all references to database names, table names, column names, index names, etc should be enclosed in backticks. | When writing production queries, all references to database names, table names, column names, index names, etc should be enclosed in backticks. | ||
For example, instead of | For example, instead of | ||
<source lang="MySQL"> | |||
SELECT shit FROM creek | |||
</source> | |||
you would write | |||
<source lang="MySQL"> | |||
SELECT `shit` FROM `creek` | |||
</source> | |||
=== Newlines === | === Newlines === | ||
Add a newline before every major part of the query (FROM, JOINs, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT). i.e., instead of | Add a newline before every major part of the query (FROM, JOINs, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT). i.e., instead of | ||
FROM `mothers` | <source lang="MySQL"> | ||
WHERE `fat` | SELECT `butt_size`, `ugliness` FROM `mothers` WHERE `fat` = 'True' ORDER BY `ugliness` | ||
ORDER BY `ugliness` | </source> | ||
you would use | |||
<source lang="MySQL"> | |||
SELECT `butt_size`, `ugliness` | |||
FROM `mothers` | |||
WHERE `fat` = 'True' | |||
ORDER BY `ugliness` | |||
</source> | |||
=== Referencing table names === | === Referencing table names === | ||
MySQL allows you to reference column names without specifying the table name, as long as the column name is not ambiguous in your query. In other words, this: | MySQL allows you to reference column names without specifying the table name, as long as the column name is not ambiguous in your query. In other words, this: | ||
<source lang="MySQL"> | |||
SELECT `hat` FROM `thing1` JOIN `thing2` ON `something` = `somethingelse` | |||
</source> | |||
is a valid query as long as `hat` is a column in either `thing1` or `thing2`, and not both. The same goes for the columns `something` and `somethingelse` - they can only be in one of the tables involved in the query. | |||
However, the way you SHOULD write the query is like so: | However, the way you SHOULD write the query is like so: | ||
<source lang="MySQL"> | |||
SELECT `thing1`.`hat` FROM `thing1` JOIN `thing2` ON `thing2.`something` = `thing1`.`somethingelse` | |||
</source> | |||
In large queries, it is MUCH easier to debug/comprehend queries that tell you the table names of all columns. Also, if a `hat` column is added to the `thing2` table at some point in the future, the former query will produce an error, while the latter will continue working. | In large queries, it is MUCH easier to debug/comprehend queries that tell you the table names of all columns. Also, if a `hat` column is added to the `thing2` table at some point in the future, the former query will produce an error, while the latter will continue working. | ||
Line 29: | Line 68: | ||
However, in any other case, it is generally not a good idea in production code - when debugging large queries written by someone else, it's generally a lot easier to figure out where the 20 columns in a SELECT are coming from if the correct table name is used, so you don't have to figure out what "pl.name" is, or where "is.amount" might be in the database. | However, in any other case, it is generally not a good idea in production code - when debugging large queries written by someone else, it's generally a lot easier to figure out where the 20 columns in a SELECT are coming from if the correct table name is used, so you don't have to figure out what "pl.name" is, or where "is.amount" might be in the database. | ||
== Common traps == | == Common traps == |
Revision as of 15:46, 22 January 2010
Formatting
When writing queries for production (for use in code, or in reports) there are some steps you can take to make your queries easier to read.
Case
Query text that references MySQL elements (basic query syntax, functions, etc) should be in all uppercase letters, while references to elements of a database structure should be in lowercase. For example, instead of
select PANTS from tableofstuff Where date(teh_date) = Now()
you would write
SELECT pants FROM tableofstuff WHERE DATE(teh_date) = NOW()
Backticks
When writing production queries, all references to database names, table names, column names, index names, etc should be enclosed in backticks.
For example, instead of
SELECT shit FROM creek
you would write
SELECT `shit` FROM `creek`
Newlines
Add a newline before every major part of the query (FROM, JOINs, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT). i.e., instead of
SELECT `butt_size`, `ugliness` FROM `mothers` WHERE `fat` = 'True' ORDER BY `ugliness`
you would use
SELECT `butt_size`, `ugliness`
FROM `mothers`
WHERE `fat` = 'True'
ORDER BY `ugliness`
Referencing table names
MySQL allows you to reference column names without specifying the table name, as long as the column name is not ambiguous in your query. In other words, this:
SELECT `hat` FROM `thing1` JOIN `thing2` ON `something` = `somethingelse`
is a valid query as long as `hat` is a column in either `thing1` or `thing2`, and not both. The same goes for the columns `something` and `somethingelse` - they can only be in one of the tables involved in the query.
However, the way you SHOULD write the query is like so:
SELECT `thing1`.`hat` FROM `thing1` JOIN `thing2` ON `thing2.`something` = `thing1`.`somethingelse`
In large queries, it is MUCH easier to debug/comprehend queries that tell you the table names of all columns. Also, if a `hat` column is added to the `thing2` table at some point in the future, the former query will produce an error, while the latter will continue working.
Aliases
It is possible to give your tables aliases using the AS command, so that you can refer to them with an alternate name in a query.
If you are joining a table to itself, it is necessary to give the table an alias at least once.
However, in any other case, it is generally not a good idea in production code - when debugging large queries written by someone else, it's generally a lot easier to figure out where the 20 columns in a SELECT are coming from if the correct table name is used, so you don't have to figure out what "pl.name" is, or where "is.amount" might be in the database.
Common traps
Over the history of ITrack, there have been hundreds of bugs fixed in reports, as well as the software itself, that were the result of a logical error in a query.
By using what you know about the database structure, you should be able to write queries that you can guarantee will always return the correct number of results. Here are some tips:
Table joins
Many of the queries you write will involve 2 tables that have a 1-to-1 or 1-to-many relationship with each other.
This means that you have one table that holds a bunch of records, where each entry links to exactly 1 row in another table - for example, every record in an "inventory" table will have a relationship to a single "inventory type" record in another table.
You may know that inventory records have an inventory type, and you will probably have to write a query that joins them together at least a few times.
When that time comes, you may think to yourself "but how do I know which fields to use to join them together? What will I be joining on?"
Your first step should always be LOOK AT THE PRIMARY KEY. If you're joining the inventorytype table onto the inventory table, look at the primary key of the inventorytype table. That way, you will know what fields are the bare minimum that must be referenced in your query.
Grouping and summarizing information
This is such an important topic that I have written an entire doc on it.
Logical errors won't stop your query from running, but they will result in bad data. If you're writing queries, you need to familiarize yourself with The GROUPing pitfall.
Seriously. Logical errors in your queries will fuck you over. Pay attention.