Embedding Python in Python #2

Anyone know a good way to embed python within python?

Now before you tell me that's silly, let me explain
what I'd like to do.

I'd like to allow user-defined scriptable objects.  I'd
like to give them access to modify pieces of my classes.
I'd like to disallow access to pretty much the rest of
the modules.

Any ideas/examples?

-Robey

0
robey (8)
8/18/2004 7:26:00 PM
comp.lang.python

You probably want something like this:

globalDict = {}
exec(stringOfPythonCodeFromUser, globalDict)

globalDict is now the global namespace of whatever was in
stringOfPythonCodeFromUser, so you can grab values from that and
selectively import them into your namespace.
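
A minimal sketch of that idea in Python 3 syntax follows. The names
user_source, area and secret_helper are made up for illustration and are
not from the thread. It only shows the namespace mechanics; it is not a
security boundary.

```python
# Hypothetical user-supplied code; in practice it might come from a file.
user_source = """
def area(width, height):
    return width * height

secret_helper = "not exported"
"""

# Fresh namespace for the user's code. exec() adds a '__builtins__'
# entry to this dict on its own if one is not already present.
global_dict = {}
exec(user_source, global_dict)

# Selectively copy only the names the host application wants.
wanted = ("area",)
exported = {name: global_dict[name] for name in wanted if name in global_dict}

print(sorted(exported))
print(exported["area"](3, 4))
```

Everything the user's code defined stays in global_dict; only the names
listed in wanted reach the host's own namespace.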

On Wed, Aug 18, 2004 at 02:26:00PM -0500, Robey Holderith wrote:
> 
> Anyone know a good way to embed python within python?
> 
> Now before you tell me that's silly, let me explain
> what I'd like to do.
> 
> I'd like to allow user-defined scriptable objects.  I'd
> like to give them access to modify pieces of my classes.
> I'd like to disallow access to pretty much the rest of
> the modules.
> 
> Any ideas/examples?
> 
> -Robey
0
indigo9502 (83)
8/18/2004 6:35:21 PM
Robey Holderith <robey@slash_dev_slash_random.org> writes:
> Anyone know a good way to embed python within python?

No.

> I'd like to allow user-defined scriptable objects.  I'd
> like to give them access to modify pieces of my classes.
> I'd like to disallow access to pretty much the rest of
> the modules.

There was a feature called rexec/Bastion for that purpose in older
versions of Python, but it was disabled (as of Python 2.3) because it
was insecure.

> Any ideas/examples?

Run your sensitive stuff in a separate process (or separate computer)
and allow the hostile clients to communicate through sockets.
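
A rough sketch of the separate-process idea, in Python 3. This is an
assumption-laden illustration: it uses stdin/stdout pipes via the
subprocess module instead of the sockets suggested above, and the helper
name run_in_child is hypothetical. Process separation by itself is not a
sandbox; the child still has whatever permissions the operating system
gives it.

```python
import subprocess
import sys

def run_in_child(source, timeout_seconds=5.0):
    """Run `source` in a separate Python process and return its stdout.

    The child shares no Python objects with the parent. Restricting what
    the child may touch (files, network) is left to the operating system.
    """
    completed = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # raises subprocess.TimeoutExpired on overrun
    )
    return completed.stdout

print(run_in_child("print(2 + 2)"))
```

The parent only ever sees text coming back over the pipe, so a crash or
a corrupted namespace in the child cannot reach the host's classes.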
0
phr.cx (5493)
8/18/2004 6:37:57 PM
Paul Rubin <http://phr.cx@nospam.invalid> wrote:
....
> There was a feature called rexec/Bastion for that purpose in older
> versions of Python, but it was disabled (as of Python 2.3) because it
> was insecure.

>> Any ideas/examples?

> Run your sensitive stuff in a separate process (or separate computer)
> and allow the hostile clients to communicate through sockets.

If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous.  You'll
need to be aggressive, but I believe it's possible.  For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc.  This is overly restrictive,
but it will provide security.
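
For illustration, the rules listed above could be sketched with
Python 3's ast module roughly as follows. The function name violations
and the FORBIDDEN_NAMES set are hypothetical, and exec is treated as a
name because it is a function rather than a statement in Python 3. An
empty result only means these particular rules found nothing; it is not
a proof that the code is safe.

```python
import ast

# Names rejected outright, following the rules in the post above.
FORBIDDEN_NAMES = {"eval", "exec"}

def is_dunder(name):
    """True for names of the __this__ form."""
    return name.startswith("__") and name.endswith("__")

def violations(source):
    """Return a list of rule violations found in `source`."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statement")
        elif isinstance(node, ast.Name):
            if node.id in FORBIDDEN_NAMES or is_dunder(node.id):
                problems.append("forbidden name: " + node.id)
        elif isinstance(node, ast.Attribute):
            if node.attr in FORBIDDEN_NAMES or is_dunder(node.attr):
                problems.append("forbidden attribute: " + node.attr)
    return problems

print(violations("x = 1 + 2"))
print(violations("import os"))
print(violations("x.__dict__"))
```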
0
joshway (2)
8/18/2004 7:12:46 PM
No. An easy way to escape that is to start one's code with
'del __builtins__', then python will add the default __builtins__ back
to the namespace. Restricting what arbitrary code can do has been
discussed many, many times, and it seems there is no way to do it short
of reimplementing a python interpreter.

On Wed, Aug 18, 2004 at 02:56:04PM -0500, Robey Holderith wrote:
> So using this (with a little additional reading) it looks like I
> can do this:
> 
> globalDict = {'__builtins__': <my modules here>}
> exec(<pythonCodeFromUser>, globalDict)
> 
> And that this will disallow both importing of new modules and direct
> access to my namespace.  It will however allow access to the
> 
> Would this be secure?
> 
> Paul, what's your take on this?
> 
> -Robey
>
> On Wed, 18 Aug 2004 14:35:21 -0400, Phil Frost wrote:
> 
> > You probably want something like this:
> > 
> > globalDict = {}
> > exec(stringOfPythonCodeFromUser, globalDict)
> > 
> > globalDict is now the global namespace of whatever was in
> > stringOfPythonCodeFromUser, so you can grab values from that and
> > selectively import them into your namespace.
> > 
> > On Wed, Aug 18, 2004 at 02:26:00PM -0500, Robey Holderith wrote:
> >> 
> >> Anyone know a good way to embed python within python?
> >> 
> >> Now before you tell me that's silly, let me explain
> >> what I'd like to do.
> >> 
> >> I'd like to allow user-defined scriptable objects.  I'd
> >> like to give them access to modify pieces of my classes.
> >> I'd like to disallow access to pretty much the rest of
> >> the modules.
> >> 
> >> Any ideas/examples?
> >> 
> >> -Robey
0
indigo9502 (83)
8/18/2004 7:27:50 PM
Robey Holderith <robey@slash_dev_slash_random.org> writes:
> Would this be secure?

No.

> Paul, what's your take on this?

Don't count on it.
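
To make that concrete, here is a small Python 3 sketch (not from the
thread; the variable names are made up). Replacing __builtins__ does
block built-ins by name, but attribute access alone is enough to walk
from a plain tuple to every class loaded in the interpreter.

```python
# Globals whose '__builtins__' is an empty dict: no built-in names at all.
restricted_globals = {"__builtins__": {}}

# 1) Built-ins really are unreachable by name.
try:
    exec("f = open('out.tmp', 'w')", restricted_globals)
    open_by_name_blocked = False
except NameError:
    open_by_name_blocked = True

# 2) Attribute access needs no built-ins: tuple -> object -> all subclasses.
exec("classes = ().__class__.__base__.__subclasses__()", restricted_globals)
reachable_classes = restricted_globals["classes"]

print(open_by_name_blocked)
print(len(reachable_classes) > 0)
```

What an attacker can do with those classes depends on what the host has
imported, which is exactly why the restriction cannot be relied on.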
0
phr.cx (5493)
8/18/2004 7:31:12 PM
JCM <joshway@myway.com> writes:
> If you're concerned about security, another possibility is to parse
> the user's code and look for anything potentially dangerous.  You'll
> need to be aggressive, but I believe it's possible.  For example,
> disallow exec statements, the identifier "eval", any identifier of
> __this__ form, import statements, etc.  This is overly restrictive,
> but it will provide security.

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint: 
  e = vars()['__builtins__'].eval
  print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.
0
phr.cx (5493)
8/18/2004 7:35:39 PM
Paul Rubin <http://phr.cx@nospam.invalid> wrote:
> JCM <joshway_without_spam@myway.com> writes:
>> If you're concerned about security, another possibility is to parse
>> the user's code and look for anything potentially dangerous.  You'll
>> need to be aggressive, but I believe it's possible.  For example,
>> disallow exec statements, the identifier "eval", any identifier of
>> __this__ form, import statements, etc.  This is overly restrictive,
>> but it will provide security.

> By the time you're done with all that, you may as well design a new
> restricted language and interpret just that.

> Hint: 
>   e = vars()['__builtins__'].eval
>   print e('2+2')

> Even Java keeps getting new holes found, and Python is not anywhere
> near Java when it comes to this kind of thing.

I don't think it's as difficult as you think.  Your snippet of code
would be rejected by the rules I suggested.  You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.
0
8/18/2004 7:44:47 PM
On Wed, 18 Aug 2004 14:35:21 -0400, Phil Frost wrote:

> You probably want something like this:
> 
> globalDict = {}
> exec(stringOfPythonCodeFromUser, globalDict)
> 
> globalDict is now the global namespace of whatever was in
> stringOfPythonCodeFromUser, so you can grab values from that and
> selectively import them into your namespace.
> 

So using this (with a little additional reading) it looks like I
can do this:

globalDict = {'__builtins__': <my modules here>}
exec(<pythonCodeFromUser>, globalDict)

And that this will disallow both importing of new modules and direct
access to my namespace.  It will however allow access to the

Would this be secure?

Paul, what's your take on this?

-Robey

> On Wed, Aug 18, 2004 at 02:26:00PM -0500, Robey Holderith wrote:
>> 
>> Anyone know a good way to embed python within python?
>> 
>> Now before you tell me that's silly, let me explain
>> what I'd like to do.
>> 
>> I'd like to allow user-defined scriptable objects.  I'd
>> like to give them access to modify pieces of my classes.
>> I'd like to disallow access to pretty much the rest of
>> the modules.
>> 
>> Any ideas/examples?
>> 
>> -Robey


0
robey (8)
8/18/2004 7:56:04 PM
On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:
> Paul Rubin <http://phr.cx@nospam.invalid> wrote:
> > JCM <joshway_without_spam@myway.com> writes:
> >> If you're concerned about security, another possibility is to parse
> >> the user's code and look for anything potentially dangerous.  You'll
> >> need to be aggressive, but I believe it's possible.  For example,
> >> disallow exec statements, the identifier "eval", any identifier of
> >> __this__ form, import statements, etc.  This is overly restrictive,
> >> but it will provide security.
> 
> > By the time you're done with all that, you may as well design a new
> > restricted language and interpret just that.
> 
> > Hint: 
> >   e = vars()['__builtins__'].eval
> >   print e('2+2')
> 
> > Even Java keeps getting new holes found, and Python is not anywhere
> > near Java when it comes to this kind of thing.
> 
> I don't think it's as difficult as you think.  Your snippet of code
> would be rejected by the rules I suggested.  You'd also want to
> prohibit other builtins like compile, execfile, input, reload, vars,
> etc.
> 
foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

This is a job for the operating system and not python.
Google groups for rexec and Bastion if you want to read ten lengthy
discussions of why this is the OS's job.

-Jack
0
jack6612 (128)
8/18/2004 8:06:34 PM
Jack Diederich <jack@performancedrivers.com> wrote:
> On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:
....
>> I don't think it's as difficult as you think.  Your snippet of code
>> would be rejected by the rules I suggested.  You'd also want to
>> prohibit other builtins like compile, execfile, input, reload, vars,
>> etc.
>> 
> foo = "ev" + "al"
> e = vars()['__builtins__'].__dict__[foo]
> print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__).  But I'd disallow vars too.
0
8/18/2004 8:25:04 PM
Robey Holderith <robey@slash_dev_slash_random.org> wrote:
> On Wed, 18 Aug 2004 19:44:47 +0000, JCM wrote:
....
>> I don't think it's as difficult as you think.  Your snippet of code
>> would be rejected by the rules I suggested.  You'd also want to
>> prohibit other builtins like compile, execfile, input, reload, vars,
>> etc.

> I'm going to have to agree with Paul on this one.  I do not feel up to
> the task of thinking of every possible variant of malicious code.  There
> are far too many ways of writing the exact same thing.  I think it would
> be much easier to write my own interpreter. 

Well it certainly isn't easier to write your own interpreter if you're
talking about the effort you'd need to put into it.  And I'm not
convinced it's that tricky to come up with a set of syntax rules to
decide whether a piece of code is simple/safe enough to run.  It
basically comes down to disallowing certain statements and certain
identifiers.  Of course you'll end up rejecting a lot of code that
isn't malicious.

If you're interested enough, I'll try to throw a safety-checker
together.  You'd have to be pretty interested though (I'm lazy).
0
8/18/2004 8:33:09 PM
On Wed, 18 Aug 2004 19:44:47 +0000, JCM wrote:

> Paul Rubin <http://phr.cx@nospam.invalid> wrote:
>> JCM <joshway_without_spam@myway.com> writes:
>>> If you're concerned about security, another possibility is to parse
>>> the user's code and look for anything potentially dangerous.  You'll
>>> need to be aggressive, but I believe it's possible.  For example,
>>> disallow exec statements, the identifier "eval", any identifier of
>>> __this__ form, import statements, etc.  This is overly restrictive,
>>> but it will provide security.
> 
>> By the time you're done with all that, you may as well design a new
>> restricted language and interpret just that.
> 
>> Hint: 
>>   e = vars()['__builtins__'].eval
>>   print e('2+2')
> 
>> Even Java keeps getting new holes found, and Python is not anywhere
>> near Java when it comes to this kind of thing.
> 
> I don't think it's as difficult as you think.  Your snippet of code
> would be rejected by the rules I suggested.  You'd also want to
> prohibit other builtins like compile, execfile, input, reload, vars,
> etc.

I'm going to have to agree with Paul on this one.  I do not feel up to
the task of thinking of every possible variant of malicious code.  There
are far too many ways of writing the exact same thing.  I think it would
be much easier to write my own interpreter. 

-Robey


0
robey (8)
8/18/2004 9:03:27 PM
On Wed, Aug 18, 2004 at 08:25:04PM +0000, JCM wrote:
> Jack Diederich <jack@performancedrivers.com> wrote:
> > On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:
> ...
> >> I don't think it's as difficult as you think.  Your snippet of code
> >> would be rejected by the rules I suggested.  You'd also want to
> >> prohibit other builtins like compile, execfile, input, reload, vars,
> >> etc.
> >> 
> > foo = "ev" + "al"
> > e = vars()['__builtins__'].__dict__[foo]
> > print e('2+2')
> 
> Also would be rejected by my original set of rules (can't use
> __dict__).  But I'd disallow vars too.

Google groups for this topic, it's been dead horse kicked.
You would have to eliminate getattr too and any C func that can
result in an infinite loop.

Not-python's-job-ly,

-Jack
0
jack6612 (128)
8/18/2004 9:10:46 PM
Jack Diederich <jack@performancedrivers.com> wrote:
> On Wed, Aug 18, 2004 at 08:25:04PM +0000, JCM wrote:
>> Jack Diederich <jack@performancedrivers.com> wrote:
>> > On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote:
>> ...
>> >> I don't think it's as difficult as you think.  Your snippet of code
>> >> would be rejected by the rules I suggested.  You'd also want to
>> >> prohibit other builtins like compile, execfile, input, reload, vars,
>> >> etc.
>> >> 
>> > foo = "ev" + "al"
>> > e = vars()['__builtins__'].__dict__[foo]
>> > print e('2+2')
>> 
>> Also would be rejected by my original set of rules (can't use
>> __dict__).  But I'd disallow vars too.

> Google groups for this topic, it's been dead horse kicked.
> You would have to eliminate getattr too and any C func that can
> result in an infinite loop.

Infinite loops (and other resource use) are a different story, not
addressed by source code inspection.  I worked on a project which
needed to run untrusted code, and we dealt with the infinite-loop
situation by always running untrusted code on the main thread and
signalling it if it took too long to execute (this worked on unix--I
don't know what you'd do on Windows).  I realize this could leave data
in a bad state.  Infinite loops are harder to deal with.
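
A sketch of the signal-based timeout described above, in Python 3 and
Unix only. The names ExecutionTimeout, run_with_timeout and spin_forever
are hypothetical. It must run on the main thread, and code stuck inside
a C call that never returns to the interpreter will not see the signal.

```python
import signal

class ExecutionTimeout(Exception):
    """Raised in the main thread when the time budget is exceeded."""

def _on_alarm(signum, frame):
    raise ExecutionTimeout("time limit exceeded")

def run_with_timeout(func, seconds):
    """Call func() on the main thread, interrupting it after `seconds`."""
    previous_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # start the countdown
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous_handler)

def spin_forever():
    while True:
        pass

try:
    run_with_timeout(spin_forever, 0.2)
except ExecutionTimeout:
    print("interrupted")
```

As noted above, the interrupted code may leave shared data half-updated,
so the host has to treat anything it touched as suspect.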
0
8/18/2004 9:36:44 PM
On Wed, 18 Aug 2004 15:27:50 -0400, Phil Frost wrote:

> No. An easy way to escape that is to start one's code with
> 'del __builtins__', then python will add the default __builtins__ back
> to the namespace. Restricting what arbitrary code can do has been
> discussed many, many times, and it seems there is no way to do it short
> of reimplementing a python interpreter.

Out of curiosity I tried the following in 2.3.4


#------Begin Code

import random

globalDict = {'__builtins__':random}
localDict  = {}
execfile("test2.py", globalDict, localDict)

print globalDict
print localDict

localDict['move']()

#------- End Code


Where test2.py looked like this:


#---------Begin Code

print __builtins__

try:
    del __builtins__
    print 'del worked'
except:
    pass

try:
    exec('del __builtins__')
    print('exec del worked')
except:
    pass

try:
    import sys
    print 'Import Worked'
except:
    pass

try:
    f = file('out.tmp','w')
    f.write('asdfasdf')
    f.close()
    print 'File Access Worked'
except:
    pass

seed()

def move():
    print __builtins__

#------ End Code

I'm sure it has a crack in it somewhere, but it doesn't
seem to be del __builtins__ .

-Robey





0
robey (8)
8/18/2004 9:48:26 PM
I've found the crack in the armor.  See additions below.

-Robey

On Wed, 18 Aug 2004 16:48:26 -0500, Robey Holderith wrote:
> 
> 
> Where test2.py looked like this:
> 
> 
> #---------Begin Code
> 
> print __builtins__
> 
> try:
>     del __builtins__
>     print 'del worked'
> except:
>     pass
> 
> try:
>     exec('del __builtins__')
>     print('exec del worked')
> except:
>     pass
> 
> try:
>     import sys
>     print 'Import Worked'
> except:
>     pass
> 
> try:
>     f = file('out.tmp','w')
>     f.write('asdfasdf')
>     f.close()
>     print 'File Access Worked'
> except:
>     pass
> 
> seed()
> 
> def move():
      #Add the following for a nice security hole
      global __builtins__
      del __builtins__ 
>     print __builtins__
> 
> #------ End Code
> 
> I'm sure it has a crack in it somewhere, but it doesn't
> seem to be del __builtins__ .
> 
> -Robey



0
robey (8)
8/18/2004 9:53:02 PM
JCM <joshway_without_spam@myway.com> writes:
> >> need to be aggressive, but I believe it's possible.  For example,
> >> disallow exec statements, the identifier "eval", any identifier of
> >> __this__ form, import statements, etc.  This is overly restrictive,
> >> but it will provide security.
> > Hint: 
> >   e = vars()['__builtins__'].eval
> >   print e('2+2')
> 
> I don't think it's as difficult as you think.  Your snippet of code
> would be rejected by the rules I suggested.  You'd also want to
> prohibit other builtins like compile, execfile, input, reload, vars, etc.

I don't see how.  Your rules were to disallow:

  1) exec statements.  My example doesn't use it.

  2) eval identifier.  My example uses eval as an attribute and not an
     identifier.  You can eliminate the use of eval as an attribute with
       e = getattr(vars()['__builtins__'], 'ev'+'al').
     Now not even the string 'eval' appears in one piece.
  3) identifiers like __this__.  My example doesn't use any.  It
     uses a constant string of that form, not an identifier.  The
     string could be computed instead, like the eval example above.
  4) import statements.  My example doesn't use them.

Conclusion, my example gets past your suggested rules.  I also didn't
use compile, execfile, input, or reload.  I did use vars but there are
probably other ways to do the same thing.  You can't take something
full of holes and start plugging holes until you think you found them
all.  You have to start with something that has no holes.  The Python
crowd has been through this many times already; do some searches for
rexec/Bastion security.
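
The argument above can be checked mechanically. The following Python 3
sketch (variable names are made up) adapts the hint to Python 3, where
the '__builtins__' entry that exec() inserts is a dict, so the lookup
uses a subscript instead of an attribute. It lists every identifier and
attribute name in the source, then runs it.

```python
import ast

# Hypothetical "untrusted" source: no import, no exec, no identifier
# named eval, no identifier of the __this__ form.
untrusted = (
    "e = vars()['__builtins__']['ev' + 'al']\n"
    "result = e('2+2')\n"
)

# Collect every identifier and attribute name that appears in the source.
tree = ast.parse(untrusted)
identifiers = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
attributes = {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}

print(sorted(identifiers))
print(sorted(attributes))

# The code still reaches eval through a computed string key.
namespace = {}
exec(untrusted, namespace)
print(namespace["result"])
```

The only suspicious identifier left is vars, and as the post says, there
are probably other ways to get at the same dictionary.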
0
phr.cx (5493)
8/18/2004 9:58:11 PM
In article <pan.2004.08.18.19.25.59.519570@slash_dev_slash_random.org>,
 Robey Holderith <robey@slash_dev_slash_random.org> wrote:

> Anyone know a good way to embed python within python?

    >>> help(eval)

;)

-M

-- 
Michael J. Fromberger             | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/  | Dartmouth College, Hanover, NH, USA
0
8/18/2004 9:59:08 PM
On Wed, 18 Aug 2004 20:33:09 +0000, JCM wrote:

> Robey Holderith <robey@slash_dev_slash_random.org> wrote:
>> On Wed, 18 Aug 2004 19:44:47 +0000, JCM wrote:
> ...
>>> I don't think it's as difficult as you think.  Your snippet of code
>>> would be rejected by the rules I suggested.  You'd also want to
>>> prohibit other builtins like compile, execfile, input, reload, vars,
>>> etc.
> 
>> I'm going to have to agree with Paul on this one.  I do not feel up to
>> the task of thinking of every possible variant of malicious code.  There
>> are far too many ways of writing the exact same thing.  I think it would
>> be much easier to write my own interpreter. 
> 
> Well it certainly isn't easier to write your own interpreter if you're
> talking about the effort you'd need to put into it.  And I'm not
> convinced it's that tricky to come up with a set of syntax rules to
> decide whether a piece of code is simple/safe enough to run.  It
> basically comes down to disallowing certain statements and certain
> identifiers.  Of course you'll end up rejecting a lot of code that
> isn't malicious.
> 
> If you're interested enough, I'll try to throw a safety-checker
> together.  You'd have to be pretty interested though (I'm lazy).


Don't do it on my behalf.  I started far too many projects doing something
similar before I realized that the only effective way to do security was
from the bottom up.  The problem looks something like this (assuming each
function has 10 places where it is implemented):

Level   |  Malicious Variation Count
-----------------------------------------
0       |    10^0
1       |    10^1
2       |    10^2
x       |    10^x

Suffice to say that in simple code... it is doable.  In a
mature interpreter... near impossible.

-Robey


0
robey (8)
8/18/2004 10:05:40 PM
Robey Holderith <robey@slash_dev_slash_random.org> writes:

> Anyone know a good way to embed python within python?
> 
> Now before you tell me that's silly, let me explain
> what I'd like to do.
> 
> I'd like to allow user-defined scriptable objects.  I'd
> like to give them access to modify pieces of my classes.
> I'd like to disallow access to pretty much the rest of
> the modules.
> 
> Any ideas/examples?

use the rexec module, or see how Zope does it


Klaus Schilling
0
8/19/2004 7:21:08 AM
Well it seems that this is impossible to do with the current Python. But
it is a feature that would be important for certain applications.
Actually I've been searching for this, too - and only found
abandoned/deprecated modules.

If you want to use the current Python interpreter to execute the code, 
you'd have to remove many language features, because they could provide 
a backdoor for malicious code. This could be done by defining a grammar 
for a subset of Python (perhaps with some semantic checks), and verify 
that the code satisfies the grammar before you feed it into eval(). This 
could either be easy (resulting in a small subset of Python that is 
probably too small for real use...), or difficult (resulting in a usable 
subset, but with a large amount of complex grammar rules - with at least 
one rule that introduces a security leak...).

A good solution has to be implemented in the Python interpreter. Are 
there any plans for future versions of Python? I've seen the phrase 
"security initiative" on this list. Was that a "there is a ..." or 
"there should be a ..."? I couldn't find anything on the web (but didn't 
search very deep).

My first idea:

- extend the C-API (alternative to Py_Initialize??) for embedding Python 
to provide a 'stripped down' interpreter: no builtins with side effects 
(like open()...), ...
I don't know anything about Python's internals or embedding Python, so I 
can't say whether this is easy or possible at all.

- communication of the embedded script to the outside world (file or 
network I/O...) must be provided by the hosting application that is 
responsible for enforcing the desired security limitations.

- wrap it into a Python module. Then you can start the isolated embedded 
Python from 'real' Python code.

The interesting (and most difficult) question is which parts of Python's 
standard library rely on "dangerous" features. This could drastically 
reduce the usability of this approach (until you build your own 'secure' 
library).
Using this model, the secure interpreter is running in the same process 
context as the insecure host. A bug in Python could result in unchecked 
access to resources of the host. For higher security a separate process 
should be started.
0
b.niemann (44)
8/19/2004 9:27:31 AM
Paul Rubin <http://phr.cx@nospam.invalid> wrote:
....
>> > Hint: 
>> >   e = vars()['__builtins__'].eval
>> >   print e('2+2')
>> 
>> I don't think it's as difficult as you think.  Your snippet of code
>> would be rejected by the rules I suggested.  You'd also want to
>> prohibit other builtins like compile, execfile, input, reload, vars, etc.

> I don't see how.  Your rules were to disallow:

>   1) exec statements.  My example doesn't use it.

>   2) eval identifier.  My example uses eval as an attribute and not an
>      identifier.  You can eliminate the use of eval as an attribute with
>        e = getattr(vars()['__builtins__'], 'ev'+'al').
>      Now not even the string 'eval' appears in one piece.

You've used eval as an identifier (at least by the terminology to
which I'm accustomed), just not as a variable.

>   3) identifiers like __this__.  My example doesn't use any.  It
>      uses a constant string of that form, not an identifier.  The
>      string could be computed instead, like the eval example above.
>   4) import statements.  My example doesn't use them.

> Conclusion, my example gets past your suggested rules.  I also
> didn't use compile, execfile, input, or reload.  I did use vars but
> there are probably other ways to do the same thing.  You can't take
> something full of holes and start plugging holes until you think you
> found them all.  You have to start with something that has no holes.

It's fine to look at it that way.  Start with a subset of Python that
you know to be safe, for example only integer literal expressions.
Keep adding more safe features until you're satisfied with the
expressiveness of your subset.
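
As a sketch of that starting point, here is a Python 3 evaluator that
accepts only integer literals combined with +, - and *. The name
safe_int_eval and the choice of operators are assumptions for
illustration. It never calls eval or exec; it walks the parsed tree
itself and rejects anything not explicitly listed, which makes it closer
to a tiny interpreter than to a filter.

```python
import ast
import operator

# Only these binary operators are allowed; everything else is rejected.
ALLOWED_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}

def safe_int_eval(source):
    """Evaluate an expression built only from integer literals and + - *."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINARY_OPS:
            combine = ALLOWED_BINARY_OPS[type(node.op)]
            return combine(evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        # Names, calls, attributes, strings, etc. all land here.
        raise ValueError("not in the allowed subset: " + type(node).__name__)

    return evaluate(ast.parse(source, mode="eval").body)

print(safe_int_eval("2 + 2 * 3"))
```

Each new feature (names, function calls, attribute access) would have to
be added as an explicit case, so nothing is permitted by accident.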

> The Python crowd has been through this many times already; do some
> searches for rexec/Bastion security.

I did do a [quick] search, and saw a lot of articles about how rexec
and Bastion were insecure; but I didn't find any arguments about how
it's (too) difficult to come up with a safe subset of Python, for some
definition of "safe".
0
8/19/2004 1:00:25 PM
JCM <joshway_without_spam@myway.com> writes:
> It's fine to look at it that way.  Start with a subset of Python that
> you know to be safe, for example only integer literal expressions.
> Keep adding more safe features until you're satisfied with the
> expressiveness of your subset.

Well ok, but then you haven't got Python, you've got some subset, with
a completely different implementation than the Python that it's
embedded in.

0
phr.cx (5493)
8/19/2004 5:06:58 PM