php – What are the necessary and most important things, we should do at validation, if an web application gets the users input or parameters? – Education Career Blog

I am always thinking about validation in any kind on the webpage (PHP or ASP, it doesn’t matter), but never find a good and accurate answer.

For example, a I have some GET-Parameter, which defines a SQL query like DESC oder ASC. (SQL-Injection?)

Or I have a comment-function for user, where the data is also saved in a database.

Is it enought to check for HTML-tags inside the data? Should the validation done before adding it to the database or showing it on the page?

I am searching for the ToDo’s which should be always performed with any data given from “outside”.

Thanks.

,

Have a good idea of what you want from the user.

You want them to specify ascending/descending order? That’s an enumeration (or a boolean), not part of an SQL query:

$query = "SELECT ... ORDER BY field " . escape($_GET'sortOrder'); //wrong

This is wrong no matter how much you escape and sanitize their string, because this is not the way to validate an enumeration. Compare:

if ($_GET'sortOrder' == 'desc') {
    $ascending = false;
} else {
    $ascending = true;
}

if ($ascending) {
    ...
} else {
    ...
}

…which does not warrant a discussion of string escaping or SQL injection because all you want from the user is a yes/no (or ascending/descending) answer.

You want them to enter a comment? Why disallow HTML tags? What if the user wants to enter HTML code?

Again, what you want from them is, say, “a text… any text with a maximum length of 1024 characters*.” What does this have to do with SQL or injection? Nothing:

$text = $_POST'commentText';

if (mb_strlen($text, ENCODING) <= 1024) {
    //valid!
}

The value in the database should reflect what the user entered verbatim; not translated, not escaped. Say you’re stripping all HTML <tags> from the comment. What happens when you decide to send comments somewhere in JSON format? Do you strip JSON control characters as well? What about some other format? What happens if HTML introduces a tag called “:)”? Do you go around in your database stripping off smileys from all comments?

The answer is no, as you don’t want HTML-safe, JSON-safe, some-weird-format-with-smileys-safe input from the user. You want text that is at maximum 1024 characters. Check for that. Store that.

Now, the displaying part is trickier. In order to display:

<b>I like HTML "tags"

in HTML, you need to write something like:

&lt;b&gt;I like HTML &quot;tags&quot;

In JSON, you would do:

{ "I like HTML \"tags\" }

That is why you should use your language facilities to escape the data when you’re using it.

The same of course goes for SQL, which is why you should escape the data when using simple query functions like mysql_query() in PHP. (Parametrized queries, which you should really be using, on the other hand, need no escaping.)

Summary

Have a really good idea of what you want as the input, keeping in mind that you almost never need, say, “HTML-safe text.” Validate against that. Escape when required, meaning escape HTML as you send to the browser, SQL as you send to the database, and so on.


*: You should also define what a “character” means here. UTF-8, for example, may use multiple bytes to encode a code point. Does “character” mean “byte” or “Unicode code point”?

,

If you’re using PDO, be sure to use prepared statements – these clean the incoming data automatically.

If using the mysql_* functions, run each variable through mysql_real_escape_string first.

You can also do validation such as making sure the variable is one of an acceptable range:

$allowed_values = array('name', 'date', 'last_login')
if(in_array($v, $allowed_values)) {
    // now we can use the variable
}

,

You are talking about two kinds of data sanitation. One is about putting user-generated data in your database and the other is about putting user-generated data on your webpage. For the former you should follow adam’s suggestions. For the later you should look into htmlspecialchars.

Do not mix these two as they do two completely different things. For that purpose sanitation should only take place at the last moment. Use adam’s suggestion just before updating the database. Use htmlspecialchars just before echoing data. Do not use htmlspecialchars on data before adding it to the database.

You might also want to look around Stackoverflow, because this sort of question has been asked and answered countless times in the past.

Leave a Comment