MySQL Shortcuts

Doug Treadwell

If you read the source code of a large number of PHP web pages, you will likely notice that they have many things in common. One of those common things is the use of many long, hand-written SQL queries. The user registration page, for example, has a query that inserts the new user into the database. That query looks something like this:

	mysql_query('
		insert into users ( username, password, email, first_name, last_name)
		values ( "' . $_POST['username'] . '", "' . $_POST['password'] . '", "' 
			. $_POST['email'] . '", "' . $_POST['first_name'] . '", "' 
			. $_POST['last_name'] . 
	'")';

Then the user is sent an email address with a link to click on, and when they click it their account is activated. Then, if it is a social networking site, they might go to their profile page and enter some information about themselves. When they click "submit", there will be a query that looks something like this:

	mysql_query('
		insert into profiles ( user_id, introduction, favorite_activities, 
			favorite_movies, favorite_books, opinions)
		values ( "' . $_POST['user_id'] . '", "' . $_POST['introduction'] . '", "' .
			$_POST['favorite_activities'] . '", "' . $_POST['favorite_movies'] 
			. '", "' . $_POST['favorite_books'] . '", "' . $_POST['opinions'] . 
	'")';

A few days later, the user comes back and decides they want to start writing in their journal (or "blog"). So the user types their blog entry, clicks "submit", and the web page handles it with a query that looks something like this:

	mysql_query('insert into blogs ( user_id, title, body, date)
	values ( "' . $_POST['user_id'] . '", "' . $_POST['title'] . '", "' 
		. $_POST['body'] . '", "' . $_POST['date'] . 
	'")';

As the user explores all the features on the site, there are probably dozens of queries like this, but there is a better and easier way of doing the same thing. As you may have noticed, each of these queries does the same thing, but with different data. Whenever you find yourself doing the same thing more than once, that's a good sign you can probably write a function to simplify your work.

Each of these queries has the same structure. They begin by calling mysql_query with "insert into", followed by a list of columns in parenthesis. After the columns is the keyword "values" followed by a list of values to insert in parenthesis. When you encounter lists of things, that is usually a sign that you can use loops to handle them. Instead of writing a new hand-written query for every new thing, consider using a function similar to the following:

	function mysql_insert($table, $input) {
		foreach ( $input as $key => $value ) {
			$columns[] = $key;
			$values[]  = '"' . $value . '"';
		}
		$columns = implode(', ', $columns);
		$values  = implode(', ', $values);	

		mysql_query('insert into ' . $table . ' ( ' . $columns . ' ) 
			values ( ' . $values . ' )');
	}

The variable $table would contain the name of the table, and the variable $input would contain an associative array of the data you intend to insert into the database. In PHP you will typically get user input as an associative array from $_GET or $_POST. So if your user submitted a form like this:

	<form>
		<input type="text" name="username" /><br />
		<input type="text" name="password" /><br />
		<input type="text" name="email" /><br />
		<input type="text" name="first_name" /><br />
		<input type="text" name="last_name" /><br />
		<input type="submit" />
	</form>

Then your $_POST array would have an element $_POST['username'] that holds their username, $_POST['password'] that holds their password, $_POST['email'] that holds their email, and so on. This form gives you everything you need and nothing you don't. The names of the inputs are the same as the field names in the users table from my first example, so as the foreach loop iterates over the elements of the $_POST array, it will put the correct field names into the SQL query that it will build for you.

In the foreach loop the "key", or "index" of the associative array is broken away from its "value" using foreach ( $input as $key => $value ). By doing this the element $input['username'] is broken into two variables. The variable $key will be set to "username" and the variable $value will be set to whatever value was held by $input['username']. If we were to use foreach ( $input as $value ) instead, we would have the value of $input['username'] but we wouldn't know the name of the input, and therefore we wouldn't know what field name in the table to insert the data into.

Inside the foreach loop, $columns[] automatically creates a new element in an array every time you assign a value to it. So the first time you use $columns[] = 'a', you are setting $columns[0] to 'a'. The second time you use $columns[] = 'b', you are setting $columns[1] to 'b'. So as the foreach loop breaks apart each element in the associative array, it adds the "key" part and the "value" part to separate arrays. Because it adds both at the same time, they stay in the same order. So $columns[0] will contain "username" and $values[0] will contain "Bob", and $columns[1] will contain "password", and $values[1] will contain Bob's password, and so on.

After building these two separate arrays, we use the implode() function to combine them into one long string like you saw in the original query. The first parameter to the implode function is the glue that is used to stick the pieces together. So if we have an array called $names with two names in it, "Bob" and "Sam", then implode(' and ', $names) will give us a string containing "Bob and Sam". So our implode in the function above will take the array $columns that has the elements "username", "password", "email", "first_name", and "last_name", and turn it into the string "username, password, email, first_name, last_name".

We do the same thing with the values, but if you notice, we put quotes around each of them as we built the array, so the string we make from that array will look something like: "MrBob", "bobspassword", "bob@smith.com", "Bob", "Smith". When these strings are combined with the rest of the query on the last line of the function, we get: insert into users (username, password, email, first_name, last_name ) values ( "MrBob", "bobspassword", "bob@smith.com", "Bob", "Smith" ). And yet, all we have to type to make this happen is: mysql_insert('users', $_POST).

Of course, if your form includes inputs that aren't meant to be put into the table, the function will not work without a few minor changes. One solution is to add a third parameter in which you provide a list of columns to update. For example:

	function mysql_insert($table, $input, $columns) {
		foreach ( $columns as $column ) {
			$values[]  = '"' . $input[$column] . '"';
		}
		$columns = implode(', ', $columns);
		$values  = implode(', ', $values);	

		mysql_query('insert into ' . $table . ' ( ' . $columns . ' ) 
			values ( ' . $values . ' )');
	}

Because $columns is already an array of the columns we want to insert data into, we can simply implode the $columns that the function was provided instead of creating the list again. However, providing this list of columns means more work. It is less work than writing the original query, but still more work than we need to do if we are willing to let the computer work a little bit harder instead.

	function mysql_insert($table, $input) {
		$show = mysql_query('show columns from ' . $table . ' where 'Key' != 'PRI');
		while ( $column = mysql_fetch_assoc($show) ) {
			if ( isset($input[$column['Field']]) ) {
				$columns[] = $column['Field'];
				$values[]  = '"' . $input[$column['Field']] . '"';
			}
		}
		$columns = implode(', ', $columns);
		$values  = implode(', ', $values);	

		mysql_query('insert into ' . $table . ' ( ' . $columns . ' ) values ( '
			. $values . ' )');
	}

This version asks MySQL for the list of columns defined for the table. Then, it loops through each of those columns to build the query. However, we probably do not want to try to insert a value into the primary key field. Usually MySQL will be doing that part for us, so we skip over the primary key field while building our query by making sure the column's Key attribute is not set to PRI. If for some reason you want to insert the primary key as well, simply remove the where clause in the show query.

Backticks ('') are required around Key in our query because Key is a reserved word in MySQL and we need to make sure it is treated as a column name instead. In this case, Key and Field are column names in a table that contains information about the column names in one of our other tables. Some other columns in this table are Type, Default, and Null. The information in these columns can also be useful, but not for what we are doing at the moment.

This function also goes a step further and does not insert any columns that it did not receive input for. In some cases we will want MySQL to use default values for input we have intentionally left out, so we don't want to insert empty strings into those columns. Empty strings ("") are not the same thing as being left out in MySQL, and default values will not be substituted if we insert an empty string. Empty strings are also considered different than non-existance in PHP. That is why we use the isset() function to check whether an $input value has been set at all, even to an empty string. If the user intended to input an empty string, we might want to allow them to in certain cases. If we had used the is_empty() function instead, users would not be able to insert empty strings or zeroes into any columns in the table, because PHP considers "" and 0 to be empty.

So far all we can do with this function is insert data. We also want to be able to update data we've inserted previously. To do that, we could use a function like this:

	function mysql_update($table, $input) {
		$show = mysql_query('show columns from ' . $table);
		while ( $column = mysql_fetch_assoc($show) ) {
			if ( $column['Key'] != 'PRI'
				&& isset($input[$column['Field']]) ) {
				$updates[] = $column['Field'] . ' = "' 
					. $_POST[$column['Field']] . '"';
			} else if ($column['Key'] == 'PRI' ) {
				$primary_key = $column['Field'];
			}
		}
		$updates = implode(', ', $updates);

		mysql_query('update ' . $table . ' set ' . $updates . ' . where . ' '
			. $primary_key . ' = "' . $input[$primary_key] . '"';
	}

In this case we ask MySQL for a list of all columns, including the primary key. Then we check each column to see if it is the primary key. In an update statement we need to know which column is the primary key so that we can target our where clause to update the correct record.

If you use the same name for the primary key in every table, such as "id", you could skip this step. Instead, you could build your query with every column where $column['Field'] is not "id" and then always use "where id =" in your update query. However, the way this function is written allows it to be used even when the primary keys are given different names on each table, without needing to tell the function which column it is.

Again, this function skips over any columns where the user provided no data. Remember, empty data such as zero (0) or an empty string ("") is still data. When isset() returns false for a variable from the $_POST or $_GET array, where our function's $input is probably going to come from, that means the user was not even provided an opportunity to enter information. For example, if the user is given a text box to enter their dog's name and they leave it blank because they do not have a dog, this function will update the associated column with an empty string. If on the other hand we did not give them this text box at all, isset() will return false for that variable.

Not updating columns where the user was unable to enter data allows for greater flexibility. You might want to let the user update a table in parts. Using this function, they can update one or two, or as many columns as you want them to. If we instead indiscriminately updated every column in the table, you would be forced to have them update everything at one time.

Automatically detecting which column is the primary key also allows you to use a few other handy functions. The two functions below allow you to select a record from a table and delete a record from a table without writing a full query.

	function mysql_select($table, $key) {
		$show = mysql_query('show columns from ' . $table . ' where 'Key' = 'PRI');
		$primary_key_column = mysql_fetch_assoc($show);
		$primary_key = $primary_key_column['Field'];
		$record = mysql_query('select * from ' . $table . ' where ' . 
		$primary_key . ' = "' . $key . '"');
		return $record;
	}

	function mysql_delete($table, $key) {
		$show = mysql_query('show columns from ' . $table . ' where 'Key' = 'PRI');
		$primary_key_column = mysql_fetch_assoc($show);
		$primary_key = $primary_key_column['Field'];
		mysql_query('delete from ' . $table . ' where ' . 
		$primary_key . ' = "' . $key . '"');
	}

Instead of writing mysql_query("select * from mytable where primary_key = 1") you could use mysql_select("mytable", "1"). Instead of mysql_query("delete from mytable where primary_key = 1") you could use mysql_delete("mytable", "1"). Sometimes though, you will want to do both at the same time. You may want to delete a record and then tell the user about the record you deleted so they feel confident that your program deleted the right one.

	function mysql_remove($table, $key) {
		$show = mysql_query('show columns from ' . $table . ' where 'Key' = 'PRI');
		$primary_key_column = mysql_fetch_assoc($show);
		$primary_key = $primary_key_column['Field'];
		$record = mysql_query('select * from ' . $table . ' where ' . 
		$primary_key . ' = "' . $key . '"');
		mysql_query('delete from ' . $table . ' where ' . 
		$primary_key . ' = "' . $key . '"');
		return $record;
	}

One last topic that would be prudent to cover is security. The functions I've shown you were built with $_POST or $_GET in mind, although they could take any input you choose to give them. One of my examples of usage was mysql_insert('users', $_POST). Normally that would be very insecure, because it would allow your users to inject their own commands into your SQL queries. However, there is a function I use to deal with this problem.

Typically programmers who are using mysql_query() are also using mysql_real_escape_string() to escape quotes and other dangerous characters that users may input. Rather than use mysql_real_escape_string repeatedly to make $_POST or $_GET content safe for SQL queries, I choose to make every element of those arrays safe automatically. At the top of your PHP page or in a separate file you include with every page, you could put the following function:

	function mysql_real_escape_array($array) {
		foreach ( $array as $key => $value ) {
			$new_array[$key] = mysql_real_escape_string($value);
		}
		return $new_array;
	}

In order to make all $_POST and $_GET elements safe, you can then simply type:

	$_POST = mysql_real_escape_array($_POST);
	$_GET  = mysql_real_escape_array($_GET);

One of the benefits of this function is that if you are working with less experienced programmers who have forgotten to make their queries secure, you do not need to rewrite every query they have written. You can simply include this function and those two lines and their queries become safe. If you need to use the unsafe data for some reason, such as displaying the original data for a user to edit in a textbox, you can then go out of your way to unslash() particular variables. It is much better to go out of your way to make variables unsafe than to worry about forgetting to make them safe.

Hopefully these functions, or modified versions of them, will be useful tools to add to your collection. I also hope the techniques shown will provide yet another way of approaching other similar problems in the future.