Safely use eval() on user-generated input


76 Replies - 3104 Views - Last Post: 15 October 2020 - 08:01 PM

#1 chris98

Posted 27 September 2020 - 03:31 PM

I recognise that many of you will react in horror at the title of this topic alone, but I've carefully examined the circumstances and personally, I can't see any other alternative than to use eval for what I'm trying to achieve. I'm open to hearing any alternatives, but my opinion is that eval is going to be most suited for what I need here.

Essentially, when I first designed my site, I added loads of columns to a single database table. Users submit data for a particular game, and each game has its own specific form elements, but rather than handle this properly, I just hard-coded the differences into the form. For example, one game might have a "balance" column while another might have "map type". Because the number of columns always matched, I reused the same database columns and populated them with whatever data was relevant to each game.

When a certain game ID was used (again I had no way of checking dynamically, so I just hard coded each ID into the HTML) then I displayed different titles for the info provided.

So for instance, the data table before might have looked a little like this (table name changed for simplicity) :

data

id | name | game_id | category_id | description | balance | estates | difficulty .......

Essentially, the balance, estates and difficulty columns are what I described above. They contain different data based on what exactly is needed for the game they're submitting for, but this was never dynamic and was always hard-coded into the HTML form. It also made validation difficult, but I managed to do this to a degree.

What I've currently designed database-wise to fix this problem is the following tables (I've named these differently here for simplicity) :

fields

id | field_type | field_name | object_name | description | required | validation | custom

1 | text / radio / select etc | Difficulty | difficulty | The difficulty of this xyz | 1 | custom | function is_valid($game, $value) { return true; }

field_options

id | field_id (from fields table) | field_title | disp_position

1 | 1 | Very Easy Difficulty | 1

field_games

id | field_id (from fields table) | game_id

1 | 1 | 1

field_data

id | file_id (from data table) | field_id (from fields table) | file_value

1 | 1 | 1 | 1

So when the user selects the data on the form (for example on a dropdown box) they will select the text from the field_options, and the corresponding ID for that record will be that value. This ID is then passed to PHP, which will then check:

  • Is it valid data, or have they tampered with the HTML form? For example, have they entered an INT where one is expected?
  • Is it an actual field option, and is it valid for the game they are uploading for?
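The two checks above might look roughly like this. It is only a sketch: the in-memory arrays stand in for rows from the field_options and field_games tables, and the function name is_valid_option is hypothetical rather than anything from the actual site.

```php
<?php
// Stand-ins for rows from field_options and field_games (assumed sample data).
$fieldOptions = [
    // option id => field id
    1 => 1,
    2 => 1,
    3 => 2,
];
$fieldGames = [
    // field id => game ids the field applies to
    1 => [1],
    2 => [2],
];

// Hypothetical helper: true only if $raw is an integer option id that
// belongs to $fieldId, and $fieldId is linked to $gameId.
function is_valid_option(array $fieldOptions, array $fieldGames, int $gameId, int $fieldId, $raw): bool
{
    // Reject anything that is not a plain integer (tampered form data).
    $optionId = filter_var($raw, FILTER_VALIDATE_INT);
    if ($optionId === false) {
        return false;
    }
    // The option must exist and belong to this field.
    if (($fieldOptions[$optionId] ?? null) !== $fieldId) {
        return false;
    }
    // The field must be linked to the game being uploaded for.
    return in_array($gameId, $fieldGames[$fieldId] ?? [], true);
}
```

In a real version the two lookups would presumably be prepared statements against the tables described above.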


However I'd like to go further than this and make sure that it's within the correct ranges supplied in the database. Now this will obviously work for select boxes and radio buttons, because I can validate by ID, but how do I validate a text input? Each text input will obviously vary, so I feel to give me the flexibility I need, I need to be able to customise this per field.

What I have in mind is a single function, called is_valid. This function will return true if the input is valid, and false if it's invalid. It will take $game and $value, being the game ID and the user input respectively.

The admin can choose how to validate each field from a list, such as int, decimal etc. "Custom" is one of these options, and lets them write this custom function. I haven't tested this yet, so it may require slight tweaking, but all that will need doing is for eval to declare the custom function written and stored for this field. I'll then call it later in the actual code to verify user input.

The data validated might need to be in a certain format, match a regex, between two numbers or letters or whatever. Whatever it is, I'm confident it can be handled in the function I need, which will return true or false. All I need is this dynamic function, but it obviously can't be predicted because each field type is different.

So my real question, is that how can I ensure (ok, an admin) is entering valid syntax for what I want and not trying to exploit the system? The only place by the way this is editable from is within the Admin CP, but I recognise I still should obviously make sure it's sanitized properly for a whole variety of reasons. I'm not necessarily interested at this point whether the code is valid PHP or not, because later on I will call eval with it to make sure it actually works before saving it to the database. So that will pick up on any syntax errors, at this point my main focus is on making sure it's safe to use.

This is kind of the result for instance, that I'd be looking for from the form:

function is_valid($game, $value)
{
    if (strlen($value) > 1)
        return true;

    return false;
}




Replies To: Safely use eval() on user-generated input

#2 Splashsky

Posted 27 September 2020 - 03:45 PM

Why don't you just make a validation function for each type of field? Regex matches for text, simple number checks for numerical inputs.
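As a rough illustration of that idea, here is one fixed, hand-written check per validation type, selected by the value stored in the "validation" column. The type names, the text pattern and the function names are assumptions made for the sketch.

```php
<?php
// Hypothetical map from the "validation" column value to a fixed check.
// Nothing stored in the database is ever executed as code.
function validators(): array
{
    return [
        'int' => function (string $v): bool {
            return filter_var($v, FILTER_VALIDATE_INT) !== false;
        },
        'decimal' => function (string $v): bool {
            return filter_var($v, FILTER_VALIDATE_FLOAT) !== false;
        },
        // Assumed pattern: 1-100 word characters, spaces and basic punctuation.
        'text' => function (string $v): bool {
            return preg_match('/\A[\w .,\'-]{1,100}\z/u', $v) === 1;
        },
    ];
}

function is_valid_by_type(string $type, string $value): bool
{
    $checks = validators();
    // Unknown types fail closed instead of being treated as valid.
    return isset($checks[$type]) && $checks[$type]($value);
}
```

A new kind of field then means adding one entry to the map, which is a code change reviewed like any other rather than something typed into the Admin CP.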

#3 chris98

Posted 27 September 2020 - 03:53 PM

The main problem with doing this really is that I might want to validate with more than just a regex, i.e. string functions as well, checking if it contains something and then doing something else to check, or maybe I might even need more than one regex etc.... in essence, it's down to flexibility.

#4 Splashsky

Posted 27 September 2020 - 03:56 PM

I get your desire for flexibility, but you might run into an issue of over-engineering. What do you think a user might do outside of a simple check to make sure the input is valid? Shouldn't the function that handles the form input know what to do in any fringe cases?

#5 chris98

Posted 27 September 2020 - 04:04 PM

The honest answer is I'm not sure, but I just want to be prepared for any scenario that might come in the way in the future.

I haven't written the front-end of the system yet so again I couldn't tell you at this point in time, but my guess would be that it would fall back to nothing and probably display an error to them saying it was invalid input. Either that or set whatever the first value is that is actually valid.

I take your point about over-engineering though. It's a difficult balance I think.

#6 Splashsky

Posted 27 September 2020 - 04:11 PM

I may be able to offer a suggestion or two... What exactly is your project? Is it for DnD? A video game database?

#7 ArtificialSoldier

Posted 27 September 2020 - 08:23 PM

If you let people enter PHP code then your server is going to be exploited, period. Stop that idea and think about whatever other alternatives you have. Maybe you just have a big interface on a field to define all of the different checks you want to define, like type, length, value, regex, etc.
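One way that kind of interface could be backed is a small set of rules stored per field as data and interpreted by fixed code. The sketch below assumes PHP 7 or later and an invented JSON format; the keys type, min, max, min_length, max_length and regex are assumptions for illustration, as is the function name check_rules.

```php
<?php
// Hypothetical rule format: a JSON object stored per field instead of PHP code.
// All keys are optional: type, min, max, min_length, max_length, regex.
function check_rules(string $rulesJson, string $value): bool
{
    $rules = json_decode($rulesJson, true);
    if (!is_array($rules)) {
        return false; // malformed rules fail closed
    }
    if (($rules['type'] ?? 'text') === 'int') {
        $n = filter_var($value, FILTER_VALIDATE_INT);
        if ($n === false) {
            return false;
        }
        if (isset($rules['min']) && $n < $rules['min']) {
            return false;
        }
        if (isset($rules['max']) && $n > $rules['max']) {
            return false;
        }
    }
    $len = strlen($value);
    if (isset($rules['min_length']) && $len < $rules['min_length']) {
        return false;
    }
    if (isset($rules['max_length']) && $len > $rules['max_length']) {
        return false;
    }
    // The pattern is data, not code: it is only ever handed to preg_match.
    // An invalid pattern makes preg_match return false, so it fails closed.
    if (isset($rules['regex']) && @preg_match($rules['regex'], $value) !== 1) {
        return false;
    }
    return true;
}
```

Under this format the example is_valid from the first post, strlen($value) > 1, would be stored as {"min_length": 2}. Admin-supplied patterns can still be slow on hostile input, so the regex rule is the part that deserves the most care.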

#8 jon.kiparsky

Posted 27 September 2020 - 08:30 PM

Cutting to the chase, the idea of evaluating user-provided strings in a production environment is something that you simply should never consider - in any language, at any time, for any reason - since it ultimately amounts to allowing the user to execute arbitrary code against your server.

Trying to "validate" the input to filter out danger is exactly the wrong approach, as you'll realize if you consider Turing's famous result. Trying to programmatically determine the result of a given program (and this is what your "validation" is trying to do) is equivalent to trying to solve the halting problem, which is known to be unsolvable.

I have to admit that reading your description of the problem you're trying to solve caused my eyes to glaze over a bit, but it doesn't matter that much, since I can read this:

Quote

how can I ensure (ok, an admin) is entering valid syntax for what I want and not trying to exploit the system


and know that this is something that you can't do by machine. This is one of the most fundamental results in computer science.

Whatever it is you're trying to do can almost certainly be done, but if you value your sanity, do not try to do it by using eval on user-provided input.

#9 chris98

Posted 28 September 2020 - 03:59 AM

Splashsky, on 28 September 2020 - 12:11 AM, said:

I may be able to offer a suggestion or two... What exactly is your project? Is it for DnD? A video game database?


I've sent you a PM outlining some more of the details of the project along with a link.... I try to keep things as anonymous as possible on the main forum.

In relation to what everyone else said: I mean this is just an early prototype. I don't necessarily know for certain what I need until I actually build the front of the system, this is just from planning that I feel I'm going to need this. I could try ignoring this problem, continuing to build the remainder of the system and then see what happens when the problem comes back up again.

I can sit on it for a while and see what happens, then report back and see whether we can figure out a more suited alternative.

#10 andrewsw

Posted 28 September 2020 - 04:14 AM

I think attempting to do this in an early prototype is worse than deferring the problem.

#11 jon.kiparsky

Posted 28 September 2020 - 06:34 AM

chris98, on 28 September 2020 - 05:59 AM, said:

In relation to what everyone else said: I mean this is just an early prototype.


Excellent. So you can easily stop doing it and figure out what you really need to do.

Again, it's very likely that you can accomplish your functional goals, but it's certain that giving the user power to execute arbitrary code on your server will cause you endless trouble. There is no better time to not do that than right now.

#12 Ornstein

Posted 28 September 2020 - 10:38 AM

As others have already said, there's probably a much cleaner approach to whatever you're ultimately trying to achieve.

Iiiiiiiiifffffffffff you do eventually decide that arbitrary code execution is your only option, you might be able to write something based on a package like PHP Parser to somewhat "sanitize" the code - where you might (for example) whitelist certain operations (especially which functions are allowed to be called) whilst being mindful of ways an attacker might try to work around said whitelisting (e.g. variable functions).
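To give a feel for what such whitelisting involves, here is a very rough sketch built on PHP's bundled tokenizer (token_get_all) rather than PHP Parser, which would be the sturdier base. The function name and the list of banned constructs are assumptions, and the check is knowingly incomplete, so it should not be read as a security boundary.

```php
<?php
// Rough sketch only, using PHP's built-in tokenizer instead of PHP Parser.
// Returns true when every plain function call in $code is on $allowed and
// none of the obviously dynamic constructs below appear.
function only_calls_whitelisted(string $code, array $allowed): bool
{
    // Constructs rejected outright (an assumed, incomplete list).
    $banned = [T_EVAL, T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE,
               T_NEW, T_EXIT, T_GLOBAL, T_CLOSE_TAG, T_INLINE_HTML];
    $skip = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT];

    // Tokenize and drop whitespace/comments so neighbouring tokens are meaningful.
    $tokens = [];
    foreach (token_get_all('<?php ' . $code) as $t) {
        if (!is_array($t) || !in_array($t[0], $skip, true)) {
            $tokens[] = $t;
        }
    }

    foreach ($tokens as $i => $t) {
        if ($t === '`') {
            return false; // backtick shell execution
        }
        if (is_array($t) && in_array($t[0], $banned, true)) {
            return false;
        }
        if (($tokens[$i + 1] ?? null) !== '(') {
            continue; // not followed by "(", so not a call
        }
        if (is_array($t)) {
            $isDeclaration = $i > 0 && is_array($tokens[$i - 1])
                && $tokens[$i - 1][0] === T_FUNCTION;
            if ($t[0] === T_STRING) {
                if (!$isDeclaration && !in_array(strtolower($t[1]), $allowed, true)) {
                    return false; // call to a function that is not whitelisted
                }
            } elseif ($t[0] === T_VARIABLE || $t[0] === T_CONSTANT_ENCAPSED_STRING
                || strpos(token_name($t[0]), 'T_NAME') === 0) {
                return false; // variable functions, 'name'(), qualified names
            }
        } elseif (in_array($t, [')', ']', '}', '"'], true)) {
            return false; // call on the result of an expression
        }
    }
    return true;
}
```

Even with a check like this, a whitelisted function that accepts a callback (array_map, usort and so on) would reopen the hole, so the whitelist itself has to be chosen with that in mind.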

An obvious problem to flag up here is that if you only verify the code on input, there's always a potential attack vector of someone with DB access being able to run arbitrary code on the server - and if you """sanitize""" the code every time it's about to be executed, there will likely be a (significant) performance penalty.

You may be able to find a balance between overhead and security if you're somehow able to create a PHP sandbox environment in which to run the arbitrary code e.g. use something like exec to run a separate instance of PHP with its own open_basedir setting, disabled extensions, restricted file/user/group permissions, etc. (The more I think about it, this may turn out to be the better option anyway.)

If you're extra-paranoid and you can afford the overhead, you might try something like Javascript instead of PHP, run in a sandboxed Rhino instance for example.

#13 ArtificialSoldier

Posted 28 September 2020 - 11:28 AM

Quote

Iiiiiiiiifffffffffff you do eventually decide that arbitrary code execution is your only option,

....then stop, and re-evaluate your design, unless you think you're smarter and know more about PHP than everyone trying to attack you. You're not going to be able to stop every trick that someone has to save an arbitrary file on your server and execute it, and if they can do that, they can do anything they want.

#14 jon.kiparsky

Posted 28 September 2020 - 12:05 PM

ArtificialSoldier, on 28 September 2020 - 01:28 PM, said:

Quote

Iiiiiiiiifffffffffff you do eventually decide that arbitrary code execution is your only option,

....then stop, and re-evaluate your design, unless you think you're smarter and know more about PHP than everyone trying to attack you. You're not going to be able to stop every trick that someone has to save an arbitrary file on your server and execute it, and if they can do that, they can do anything they want.



And if you think you're smart enough to pull this off, then consider taking remedial CS courses because, once again, it's not a question of how smart you are.

Quote

As others have already said, there's probably a much cleaner approach to whatever you're ultimately trying to achieve.


Nor is it a matter of how clean your approach is.

Quote

if you """sanitize""" the code every time it's about to be executed, there will likely be a (significant) performance penalty.


Nor is it a matter of performance.

This really is first-year computer science: there is no way to programmatically examine arbitrary code and ensure that it is "safe". It is literally more likely that you will prove P==NP than that you will solve this problem (since P==NP is at least technically possible).

Trying to corral the code or sandbox it or restrict the allowable executable operations probably sounds like a fun and interesting problem, but the only way this will be safe is if you restrict PHP until it is no longer Turing-complete. At that point, you will be able to reason about the execution mechanically - but it will be much easier to just write a DSL to express exactly and only the things you want to express, and to parse input in that DSL. That is, if you really need to accept language-like input from the user at all, and frankly I have a hard time seeing why you believe that is necessary.

#15 Ornstein

Posted 28 September 2020 - 02:16 PM

RE code analysis:

Given the OP's requirements specifically, we can know in advance what sort of things the "programmer" should and shouldn't need to be able to do - and in extreme cases, the input code could be altered where necessary to add additional runtime security - all of which is the important difference between the possible and the impossible here.

In PHP, restricting function calls alone will eliminate 99% of threats. The remaining 1% would be things like infinite loops, accessing/altering global variables, etc. - which I'm fairly sure could all be accounted for. (Loops are a good example of where you'd likely struggle with static analysis alone and need to inject some additional code - if you even wanted/needed to allow loops, that is.)

RE sandboxing:

There's an obvious point to make here: if this wasn't possible - or somehow crippled the language in the process - shared hosting providers, VPS providers and the like would be in a lot of trouble. ;)

I'm still not saying any of this should necessarily be done. That said, there's plenty of popular software out there with code editors built into the admin panel, for some purpose or another - so let's not pretend the idea itself is inconceivable.