
digitalmars.D - Walter did you really go Ohhhh?

PatrickD <patrick.down gmail.com> writes:
http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html

<Steve Yegge>
He told me the other day, [talking about] one of my blog rants, that he didn't
agree with the point that I'd made that virtual machines are "obvious". You
know? I mean, of course you use a virtual machine!

But he's a compiler dude, and he says they're a sham, they're a farce, "I don't
get it!" And so I explained it [my viewpoint] to him, and he went: Ohhhhhhh.
</Steve Yegge>
Jun 15 2008
Walter Bright <newshound1 digitalmars.com> writes:
PatrickD wrote:
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
 
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!
 
 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
 
Steve and I did talk about VMs, and I did say that I thought they were a sham. Maybe I did say Ohhhhh, but that was more of understanding his point of view than agreeing with it. Steve also says in his blog that the advantage of VMs is language interoperability. I don't agree, since all you need for interoperability is an ABI. For compiled languages, the C ABI serves just fine, and as long as each language has a way to get at the C ABI, you have language interoperability. Case in point - D!
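To make that concrete, a minimal sketch of the extern(C) route (the function c_add is invented for illustration; it's defined in D here so the example stands alone, but it could equally come from a C compiler):

// This function has C linkage and the C calling convention. It could
// just as well live in a .c file compiled separately and linked in,
// with the D side declaring only:  extern (C) int c_add(int, int);
extern (C) int c_add(int a, int b) { return a + b; }

import std.stdio;

void main()
{
    // The call crosses a plain C-ABI boundary: no VM, no marshaling
    // layer, just a calling convention both sides agree on.
    writefln("%d", c_add(2, 3));
}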
Jun 15 2008
Jan Claeys <digitalmars janc.be> writes:
On Sun, 15 Jun 2008 11:40:33 -0700, Walter Bright wrote:

 Steve also says in his blog that the advantage of VMs is language
 interoperability. I don't agree, since all you need for interoperability
 is an ABI. For compiled languages, the C ABI serves just fine, and as
 long as each language has a way to get at the C ABI, you have language
 interoperability.
The C ABI isn't really useful if you want (standardised) interoperability on an OO level. That's where an OO machine design (be it a VM or a physical machine design--I don't care, and I don't see the difference for language designers) can be useful.

E.g., I've been wondering for some time what a hardware design like Linn's Rekursiv[*] could have meant to computer languages if it hadn't been killed for various reasons (mostly financial...).

[*] <http://www.cpushack.net/CPU/cpu7.html>

--
JanC
Jun 29 2008
"Nick Sabalausky" <a a.a> writes:
"PatrickD" <patrick.down gmail.com> wrote in message 
news:g33e3g$pdd$1 digitalmars.com...
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html

 <Steve Yegge>
 He told me the other day, [talking about] one of my blog rants, that he 
 didn't agree with the point that I'd made that virtual machines are 
 "obvious". You know? I mean, of course you use a virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a farce, "I 
 don't get it!" And so I explained it [my viewpoint] to him, and he went: 
 Ohhhhhhh.
 </Steve Yegge>
I've never read much of Steve Yegge's stuff (in large part because I have better things to do than read through a book-sized blog post. "But I deliberately make them long because it's the opposite of everyone else and that makes mine stand out!" Yea, good for you, I don't care.) But after reading through the Intro, "FOO Chaos", "The right way to do unit testing", and "Static Typing's Paper Tigers", I'm now convinced Steve's full of shit.

FOO Chaos:

First he says "VMs are great for language interop", then he demonstrates that VMs *don't* solve the language interop issue. Ok, fine, then he scales back his claim and says "Well, they help!" Doesn't do much to convince me that VMs are "obvious".

But then, the whole idea of VMs being better for language interop is preposterous anyway. After all, how do VMs work? You take a high-level language, compile it down to a sequence of pre-defined binary opcodes, and execute. Hey! Just like a real CPU! So if you can solve language interop on a VM, you can do the same thing to solve it for native code. And what is that thing that "solves" it for VMs? (Oh that's right - it doesn't solve it, it merely *helps* it.) A standard ABI, or at least something that basically boils down to a standard ABI. And that can't be done on native code...why?

So the strengths of VMs (and sure, there are some - but they're limited) do not lie in language interop. And maybe I'm wrong, but I'd imagine that a bigger problem for language interop would be different languages for which there is no single machine target (native or VM) that they all have in common.

The right way to do unit testing:

"And [on a dynamically-typed language] when it works [for that mock data], you're like, "Oh yeah, it works!" You don't run it through a compiler. You copy it and paste it into your unit test suite. That's one unit test, right? And you copy it into your code, ok, this is your function."

Soo... he's advocating the strategy of assuming something works just because it worked for your mock data? Unit tests and regression tests catch one set of bugs; a good statically-typed compiler catches another set. The sets probably intersect, but one is not a superset of the other.

"To a large extent, especially in C++ and Java, the way you develop is: [step 1, 2, 3, etc.] So it's this batch cycle, right? 1950s. Submit your punch cards, please."

I'm sure he didn't mean this as a serious argument, just a jab, but seriously, you could say the same thing about the scientific method. It's a step-by-step batch cycle too.

Static Typing's Paper Tigers:

"[Static Typing is a talisman that "keeps real tigers away". And I'm proving this by pointing out examples of big production systems written in dynamically-typed languages (while forgetting that VB and VB.NET both support and typically make appropriate use of static typing)]"

Ok, so you *can* make big production systems in dynamically-typed languages. So what? You *can* also do it in Perl or Assembly. I don't think anyone disputes that. You can build a whole house using a coin to drive in all your screws, but that doesn't turn screwdrivers into proverbial tiger-dispelling talismans. The question is: during the course of those programs' development (and maintenance), how much time, effort and money did they spend chasing after things that a good statically-typed language would have immediately caught/prevented? Oh, is it *those* things that are the proverbial tigers? So just because big production systems *have* been made using those languages, that automatically implies that the developers *didn't* ever come across those problems and have to spend their time overcoming them?

I did agree with Steve on one thing though: "What are the odds that XML's going to wind up being less verbose than *anything*?"
Jun 15 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:g33t27$2dq$1 digitalmars.com...
 "PatrickD" <patrick.down gmail.com> wrote in message 
 news:g33e3g$pdd$1 digitalmars.com...
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html

 <Steve Yegge>
 He told me the other day, [talking about] one of my blog rants, that he 
 didn't agree with the point that I'd made that virtual machines are 
 "obvious". You know? I mean, of course you use a virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a farce, "I 
 don't get it!" And so I explained it [my viewpoint] to him, and he went: 
 Ohhhhhhh.
 </Steve Yegge>
 [snip]
One more thing: I also take issue with Steve's implication (somewhere in that post, I can't find it in that haystack now) that you need VMs for runtime reflection. Umm... If debugging symbols/type-info can be injected into the executable and read/interpreted by a debugger at runtime, then they can be made readable by the program itself at runtime. (In fact, isn't there already a D library that enables runtime reflection by doing just that?) And if there's any needed info that's not in the injected debugging symbols, what's to stop the compiler/language-definition from just sticking it in the vtable, or something else akin to a vtable?
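A minimal sketch of that idea using what the D compiler already emits (nothing here depends on any particular reflection library):

import std.stdio;

class Widget {}

void main()
{
    Object o = new Widget();
    // typeid() follows the reference the compiler stored alongside the
    // object's vtable to its TypeInfo/ClassInfo record, so the dynamic
    // type name is readable at runtime; no VM involved.
    writefln("%s", typeid(o));
}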
Jun 15 2008
Robert Fraser <fraserofthenight gmail.com> writes:
Nick Sabalausky wrote:
 But then, the whole idea of VMs being better for language interop is 
 preposterous anyway. After all, how do VMs work? You take a 
 high-level-language, compile it down to a sequence of pre-defined binary 
 opcodes, and execute. Hey! Just like a real CPU! So if you can solve 
 language interop on a VM, you can do the same thing to solve it for native 
 code.
By that argument, anything that a VM can do, native code should be able to do. This is kind of true, but to get some of those things (i.e. hot-swapping, security management, selective dynamic loading) working, you almost need to implement a mini-VM.
Jun 15 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Robert Fraser" <fraserofthenight gmail.com> wrote in message 
news:g346g3$ou7$1 digitalmars.com...
 Nick Sabalausky wrote:
 But then, the whole idea of VMs being better for language interop is 
 preposterous anyway. After all, how do VMs work? You take a 
 high-level-language, compile it down to a sequence of pre-defined binary 
 opcodes, and execute. Hey! Just like a real CPU! So if you can solve 
 language interop on a VM, you can do the same thing to solve it for 
 native code.
 By that argument, anything that a VM can do, native code should be able to do. This is kind of true, but to get some of those things (i.e. hot-swapping, security management, selective dynamic loading) working, you almost need to implement a mini-VM.
True, but I guess what I was trying to say was "How do VMs work from the perspective of language interop?" From the security perspective, for instance, there are differences (with a VM, you can sandbox whatever you want, however you want, without requiring a physical CPU that supports the appropriate security features). But for language interop it all just comes down to "standard ABI", regardless of whether it's a VM's machine code or a real CPU's machine code.
Jun 15 2008
Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 By that argument, anything that a VM can do, native code should be able 
 to do. This is kind of true, but to get some of those things (i.e. 
 hot-swapping, security management, selective dynamic loading) working, 
 you almost need to implement a mini-VM.
Heck, there's nothing stopping one from writing a CPU instruction-set emulator and creating a 'VM' that way. There are only a couple hundred rather simple instructions that need to be emulated.
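A toy sketch of such a dispatch-loop 'VM' in D, with an invented four-instruction machine standing in for a real CPU's couple hundred:

import std.stdio;

enum Op : ubyte { push, add, print, halt }

// The whole 'VM': fetch an opcode, dispatch on it, repeat.
void run(const(ubyte)[] code)
{
    int[] stack;
    for (size_t pc = 0; pc < code.length; ++pc)
    {
        final switch (cast(Op) code[pc])
        {
        case Op.push:
            stack ~= code[++pc];          // next byte is the operand
            break;
        case Op.add:
            stack[$ - 2] += stack[$ - 1];
            stack = stack[0 .. $ - 1];
            break;
        case Op.print:
            writefln("%d", stack[$ - 1]);
            break;
        case Op.halt:
            return;
        }
    }
}

void main()
{
    ubyte[] program = [Op.push, 2, Op.push, 3, Op.add, Op.print, Op.halt];
    run(program);                         // prints 5
}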
Jun 15 2008
David Jeske <davidj gmail.com> writes:
Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences 
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.) 
It seems that security/verifiability and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software-based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security.

There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
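The core trick from that page, sketched in D (the segment size and layout are invented; real SFI rewrites the untrusted machine code itself so the mask can't be bypassed):

// Software-based fault isolation, toy version: every store the
// untrusted code makes is rewritten to mask its target address first,
// so the write can only ever land inside the sandbox segment.
enum size_t SEGMENT_BITS = 20;                 // 1 MB sandbox (made up)
enum size_t OFFSET_MASK  = (1 << SEGMENT_BITS) - 1;

void sandboxedStore(ubyte[] segment, size_t addr, ubyte value)
{
    // Clamp the offset instead of checking it: no branch, no fault; a
    // wild pointer just wraps to somewhere harmless inside the segment.
    segment[addr & OFFSET_MASK] = value;
}

void main()
{
    auto segment = new ubyte[1 << SEGMENT_BITS];
    sandboxedStore(segment, 0xDEAD_BEEF, 42);  // out-of-range address
    // the store landed at 0xDEADBEEF & OFFSET_MASK, inside the segment
}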
Jun 16 2008
parent reply "Nick Sabalausky" <a a.a> writes:
"David Jeske" <davidj gmail.com> wrote in message 
news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
 It seems that security/verifiability and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software-based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security. There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
Thanks. Interesting read. Although expanding *every* write/jump/(and maybe read) from one instruction each into five instructions each kinda makes me cringe (but maybe it wouldn't need to be a 1-to-5 on every single write/jump after some sort of optimizing-compiler-style magic?). I know that paper claims an overhead of only 4.3% (I wish it had a link to an online copy of the benchmark tests/results), but it was written ten years ago and, as I understand it, pipelining and cache concerns make a far larger speed difference today than they did back then. And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.

Plus, maybe this has changed in recent years, but back when I was doing x86 asm (also about ten or so years ago), the x86 had *very* few general-purpose registers. Like 4 or 5, IIRC. If that's still the case, that would just make performance worse, since the 5-6 extra registers this paper suggests would turn into additional memory accesses (and I imagine they'd be cache-killing accesses). I'm not sure what they mean by i860, though. Intel-something-or-other probably, but I assume i860 isn't the same as i86/x86.

Granted, I know performance is a secondary concern, at best, for the types of situations where you would want a sandbox. But I can't help thinking about rasterized drawing, video decompression, and other things Flash does, and wonder what Flash would be like if the browser placed the Flash plugin (I mean the actual browser plugin, not an SWF) into this style of sandbox.

Of course, VMs have overhead too (though I doubt Flash's rendering is done in a VM), but I'm not up-to-date enough on all the modern VM implementation details to know how a modern VM's overhead would compare to this. Maybe I'm just confused, but I wonder if a just-in-time-compiled VM would have the potential to be faster than this, simply because the VM's bytecode (presumably) has no way of expressing unsafe behaviors, and therefore anything translated by the VM itself from that "safe" bytecode to real native code would not need those extra runtime checks. (Hmm, kinda weird to think of a VM potentially being *faster* than native code for something.)
Jun 17 2008
Yigal Chripun <yigal100 gmail.com> writes:
Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
 [snip]
Could you explain please why there's a need for a sandbox in the first place? I think that security should be enforced by the OS.

On Windows, I see the need for external means of security like a VM, since the OS doesn't do security (Microsoft's sense of the word is to annoy the end user with a message box, requiring him to press OK several times...). But on other OSes that seems unnecessary, since the OS provides ways to manage security for code. Linux has SELinux, and there are newer OSes developed around the concept of capabilities.

So, unless I'm on Windows, what are the benefits of a VM that I won't get directly from the OS?

--Yigal
Jun 17 2008
Georg Wrede <georg nospam.org> writes:
Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
Jun 18 2008
Yigal Chripun <yigal100 gmail.com> writes:
Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
I disagree. Running code "as the user" is a bad concept, and OS security can and does protect the user's files from it. Current OSes use ACLs (Windows, Linux, etc.), and there's nothing stopping you from defining a file to be read-only, or non-executable, to protect data; the current practice is to define "users" for daemons in order to protect data. That's why Apache runs as user www-data with its own ACL rules. You can achieve perfect security with this scheme if you invest enough time to create a separate "user" for each process. As an example, I can run my browser as a different limited user, or use a browser which runs inside a sandbox. I get the same protection from both, but the sandbox solution has more overhead.

It's easy to see all the problems with manually defining ACLs. Newer OSes based on the concept of "capabilities" remove all those problems. Such OSes give processes defined capabilities unrelated to any concept of a user (the concept of users is defined on top of the capabilities mechanism).

Capabilities are basically the same as OOP. Simplified example: currently OSes are written in a procedural way; there are global data structures and global system calls. I.e., you print to screen via Stdout(text); in D, which in the end just calls the appropriate syscall. In a capabilities-based OS, there are no such global syscalls/functions. You need to hold an output instance (a handle in the OS - a Capability) in order to call its print method. Only if the process has that instance can it print to the screen. Security is implemented via the explicit passing of such instances: if the program received an output instance, it received the right to print to the screen. No sandboxes/VMs/any other emulation layer is needed.

--Yigal
Jun 19 2008
Georg Wrede <georg nospam.org> writes:
Yigal Chripun wrote:
 Georg Wrede wrote:
Yigal Chripun wrote:

could you explain please why there's a need for a sandbox in the
first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
I disagree. Running code "as the user" is a bad concept, and OS security can and does protect the user's files from it.
If the code that gets run "as the user" is malicious, and there are no additional guards, then the code could chmod any read-only file you have and then edit it, according to its malicious goals. In practice, these additional guards constitute the sandbox.
 current OSes use ACLs (windows, linux, etc..) and there's nothing
 stopping you from defining a file to be read only, or non-executable to
 protect data, and the current practice is to define "users" for deamons
 in order to protect data.
Not on my servers, they don't. I rely solely on user/group stuff. And I find it adequate.
 that's why apache runs with user www-data with
  its own ACL rules.
Apache has run as "www-data" or whatever, since the beginning of time, and that has been because it is natural and "obvious" to give the WWW server its own identity.
 you can achieve perfect security with this scheme if
 you invest enough time to create a separate "user" for each process. 
If this were so simple, then we'd have had no issue with this entire subject for the last 5 years. To put it another way, if the WWW server could run every user's code as a separate OS user, then of course things would be different. But the average Unix (Linux, etc) only has 16 bits of information to identify the "user". And sites like Google have users in the billions. So, it's not a viable option.
 as an example, I can run my browser as a different limited user or use a
 browser which runs inside a sandbox. I can get the same protection from
 both but the sandbox solution has more overhead.
Server and client problems should be kept separate in one's mindset.
 it's easy to see all the problems with manually defining ACLs.
 Newer OSes based on the concept of "capabilities" remove all those
 problems.
"All those problems". You've been listening to marketing talk.
 such OSes give processes defined capabilities unrelated to any
 concept of a user (the concept of users is defined on top of the
 capabilities mechanism).
I was the Oracle DB Head Administrator in the early '90s at a local University. The concept of Roles was introduced then by Oracle. I actually got pretty excited about this. Instead of Mary, Jo-Anne, and Jane all having their respective read, write and update rights, I could define Roles (which is pretty near the Capabilities concept), so that Updater of Student Credits, Updater of Student Addresses, Updater of Class Information, etc. could all be defined, and when any of the girls went on holidays, I could simply assign the Role to the back-up person, instead of spending days on fixing read-update-write rights for individual table columns and/or views.
 Capabilities are basically the same as OOP - simplified example:
 currently OSes are written in a procedural way, there are global data
 structures and global system calls. i.e. you print to screen via
 Stdout(text); in D which just calls in the end the appropriate syscall.
 in a capabilities based OS, there is no such global syscalls/functions.
...
 No sandboxes/VMs/any other emulation layer is needed.
Gee, nice. Still, D has to relate to what's going on today.
Jun 19 2008
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Yigal Chripun" <yigal100 gmail.com> wrote in message 
news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
 [snip]
If I understand all this right, it sounds like this is how it works within the context of browser plugins and applets embedded in a webpage (i.e., something like Flash/Java Applet/ActiveX):

Old (current) way: A browser is run as user X. Thus, the browser can do anything user X can do. A browser plugin, by its nature, can do anything the browser can do (read locally stored webpages, read/write cookies, cache and browser history, delete everything in /home/userX, etc). So to prevent a malicious webpage from embedding something that...well, acts maliciously, there are two options:

1. The browser plugin has to *be* a sandboxing platform like Flash or Java Applets, but unlike ActiveX. This plugin/platform is trusted to not expose unsafe things to the applets running inside of it.

2. (Better) The browser sets up a special limited-rights user for plugins/applets (or optionally, one for each plugin/applet, for finer-grained control). The plugin/applet is run as this limited-rights user.

New way: Sounds like basically the same thing, except replace "user X" with "a few OS handles", and "browser creates 'browserPlugin' user" with "browser selectively passes its own OS handles to the plugins as it sees fit"?

And I suppose you configure the OS to grant/disallow these handles in more or less the same way user rights are currently granted? Except they're granted to programs in addition to/instead of users? And I'd assume you'd still need some sort of ACLs so a program can't just go, "Aha! I need to open/save files, so I got a 'write file' handle and a 'read file' handle! Now I can use those handles to read all of the person's private data and overwrite the system files with pictures of potatoes!"
Jun 19 2008
Yigal Chripun <yigal100 gmail.com> writes:
Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message 
 news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
I disagree. OS security can and does protect the user's files from code that's run "as the user" <-this is a bad concept. current OSes use ACLs (windows, linux, etc..) and there's nothing stopping you from defining a file to be read only, or non-executable to protect data, and the current practice is to define "users" for deamons in order to protect data. that's why apache runs with user www-data with its own ACL rules. you can achieve perfect security with this scheme if you invest enough time to create a separate "user" for each process. as an example, I can run my browser as a different limited user or use a browser which runs inside a sandbox. I can get the same protection from both but the sandbox solution has more overhead. it's easy to see all the problems with manually defining ACLs. Newer OSes based on the concept of "capabilities" remove all those problems. such OSes give processes defined capabilities unrelated to any concept of a user (the concept of users is defined on top of the capabilities mechanism). Capabilities are basically the same as OOP - simplified example: currently OSes are written in a procedural way, there are global data structures and global system calls. i.e. you print to screen via Stdout(text); in D which just calls in the end the appropriate syscall. in a capabilities based OS, there is no such global syscalls/functions. you need to hold an output instance (a handle in the OS - a Capability) in order to call its print method. only if the process has that instance it can print to the screen. security is implemented via the explicit passing of such instances. so if the program received an output instance, it received the right to print to the screen. No sandboxes/VMs/any other emulation layer is needed. --Yigal
If I understand all this right, it sounds like this is how it works within the context of browser plugins and applets embedded in a webpage (ie, something like Flash/Java Applet/ActiveX): Old (current) way: A browser is run as user X. Thus, the browser can do anything user X can do. A browser plugin, by its nature, can do anything the browser can do (read locally stored webpages, read/write cookies cache and browser history, delete everything in /home/userX, etc). So to prevent a malicious webpage from embedding something that...well, acts maliciously, there are two options: 1. The browser plugin has to *be* a sandboxing platform like Flash or Java Applets, but unlike ActiveX. This plugin/platform is trusted to not expose unsafe things to the applets running inside of it. 2. (Better) The browser sets up a special limited-rights user for plugins/applets (or optionally, one for each plugin/applet, for finer-grained control). The plugin/applet is run as this limited rights user.
yep
 New way:
 Sounds like basically the same thing except replace "user X" with "a few OS 
 handles", and "browser creates 'browserPlugin' user" with "browser 
 selectively passes its own OS handles to the plugins as it sees fit"?
Not exactly the same thing. The major difference is this: with users/roles/groups/rules/ACLs/etc., the security is separate from the code. See the explanation below.
 
 And I suppose you configure the OS to grant/disallow these handles in more 
 or less the same way user rights are currently granted? Except they're 
 granted to programs in addition to/instead of users? And I'd assume you'd 
 still need some sort of ACLs so a program can't just go, "Aha! I need to 
 open/save files, so I got a 'write file' handle and a 'read file' handle! 
 Now I can use those handles to read all of the person's private data and 
 overwrite the system files with pictures of potatoes!"
 
The concept of users in the system is implemented /on top/ of the capabilities mechanisms in the kernel (or actually a micro-kernel, to be precise). The kernel has no concept of users at all; this is all implemented in user space.

Think of it like this: processes on the system are entities that have certain capabilities. They can exchange those capabilities with each other. Users would also be implemented in that system as entities (of a different kind) that can have the same capabilities. This means that security is implemented at a lower level.

Here's a snippet from the relevant article on Wikipedia:

Suppose that the user program successfully executes the following statement:

int fd = open("/etc/passwd", O_RDWR);

The variable fd now contains the index of a file descriptor in the process's file descriptor table. This file descriptor is a capability. Its existence in the process's file descriptor table is sufficient to know that the process does indeed have legitimate access to the object. A key feature of this arrangement is that the file descriptor table is in kernel memory and cannot be directly manipulated by the user program.
Jun 20 2008
parent reply "Nick Sabalausky" <a a.a> writes:
"Yigal Chripun" <yigal100 gmail.com> wrote in message 
news:g3gltb$1b4a$1 digitalmars.com...
 Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message
 news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
 [snip]
yep
 New way:
 Sounds like basically the same thing except replace "user X" with "a few 
 OS
 handles", and "browser creates 'browserPlugin' user" with "browser
 selectively passes its own OS handles to the plugins as it sees fit"?
 Not exactly the same thing. The major difference is this: with users/roles/groups/rules/ACLs/etc., the security is separate from the code. See the explanation below.
 And I suppose you configure the OS to grant/disallow these handles in 
 more
 or less the same way user rights are currently granted? Except they're
 granted to programs in addition to/instead of users? And I'd assume you'd
 still need some sort of ACLs so a program can't just go, "Aha! I need to
 open/save files, so I got a 'write file' handle and a 'read file' handle!
 Now I can use those handles to read all of the person's private data and
 overwrite the system files with pictures of potatoes!"
 [snip]
Ok, I think I'm starting to get it, but I'm still a little fuzzy on some stuff. (For clarity, I'm going to use the terms "human user" and "OS user" to disambiguate what I mean by "the user". Human user of course being the actual person, and OS user being the OS's concept of a user.) Suppose I'm writing a hex editor for one of these capabilities-based OSes. I've got to be able to read/write various files.

Non-capabilities way: The human user runs my app. The app is run as either OS user "userX", or some special OS user that the program was configured to run as. The human user tells my app, "Open file 'fileX' (passed by filename)". My app then says to the OS, "Open file X for me, based on the credentials of whatever OS user I'm being run as". The OS then looks up the ACL info and grants/denies access accordingly.

Capabilities way: The human user runs my app. The human's OS user object (name of this object is 'userX') is passed to my app kinda like a command line parameter would be. The human user tells my app, "Open file 'fileX' (passed by filename)". My app then *doesn't* talk to the OS, but instead goes to userX: "userX.openfile(fileX, whateverAccessLevel)". If the OS user has that capability and is willing to give it to my app, then it returns the appropriate capability. If the OS user doesn't have that capability, then it requests it from whatever its authority is (who? The OS?), just like how the app requested it from the OS user. Somehow, the OS user's authority (if it's successfully able to retrieve access from its authority) decides whether or not to allow userX access (I still can only imagine this part involves some sort of ACL or ACL-equivalent).
Jun 20 2008
Yigal Chripun <yigal100 gmail.com> writes:
Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message 
 news:g3gltb$1b4a$1 digitalmars.com...
 Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message
 news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
 [snip]
yep
 New way:
 Sounds like basically the same thing except replace "user X" with "a few 
 OS
 handles", and "browser creates 'browserPlugin' user" with "browser
 selectively passes its own OS handles to the plugins as it sees fit"?
 [snip]
 And I suppose you configure the OS to grant/disallow these handles in 
 more
 or less the same way user rights are currently granted? Except they're
 granted to programs in addition to/instead of users? And I'd assume you'd
 still need some sort of ACLs so a program can't just go, "Aha! I need to
 open/save files, so I got a 'write file' handle and a 'read file' handle!
 Now I can use those handles to read all of the person's private data and
 overwrite the system files with pictures of potatoes!"
 [snip]
 Capabilities way: The human user runs my app. The human's OS user object (name of this object is 'userX') is passed to my app kinda like a command line parameter would be. The human user tells my app, "Open file 'fileX' (passed by filename)". My app then *doesn't* talk to the OS, but instead goes to userX: "userX.openfile(fileX, whateverAccessLevel)". [snip]
First, a Wikipedia link: http://en.wikipedia.org/wiki/Capabilities

Now let's try to figure out the second way. Let's define a system with 3 users: root, userA, userB. From a security POV those are 3 entities with caps [I'll use that instead of typing "capabilities"]. userA has all caps [view, edit, accessGUI] for image file A, and a play cap for audio file B. When the real person runs a gimp process on file A as userA, the gimp process receives the user's caps on creation. The gimp process can use the edit cap to edit file A, and show the changes on screen via the accessGUI cap.

In your example, think of userX as a list of caps. When your human user runs the app, it runs it with app.run(listOfCapsForApp) {that's pseudo code}. The app doesn't know or care about the user; it has a list of caps.

On a *nix system you can include header files and call syscalls, which are just global functions provided by the OS. With caps, the headers define something like classes [which in turn need to receive a different set of caps to create an instance of themselves]. Your app can get a File cap (which is an OS object instance) that defines the relevant syscalls as methods on that object. So instead of openfile(fileX, whateverAccessLevel), you get a fileX from the user and do a fileX.open(). You do not pass the accessLevel to open(), since the fact that you have that fileX reference in your app implies you already have the needed access level to use its methods.

Instead of separate access rules, think of it like this: if you do not have a fileX object, then you cannot do anything on that file. If you have a fileX object, you can use all its methods. If you have an invariant fileX (to use the D terminology), then you can only use the invariant methods.

The system has a predefined set of caps when booted. You can use those caps to create more caps and pass those to different processes, and so on. The UI can translate user actions to caps - for example, you can think of opening a file in a dialog in your editor app as [implicitly] providing the process a cap to that file. If you have only read-only access to the file (as defined in your OS user), then you can only give the process a read-only cap.

One last note: caps allow you to limit the behavior of processes - even if your MP3 file contains a virus that deletes all your files, you can safely play it, since the player only has a cap to play that file and cannot run arbitrary code on the system (even if you run the player as root!), because the player process doesn't need any concept of a user to work and only cares about the caps it currently has.
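Roughly what that fileX idea looks like as code - a sketch in D with invented types; a real capability OS would hand out kernel-protected handles rather than class instances:

import std.stdio;

// Possession of one of these interfaces *is* the permission: there is
// no global open() to ask, and no ACL lookup at call time.
interface ReadCap
{
    string read();
}

interface WriteCap : ReadCap
{
    void write(string data);
}

class FileCap : WriteCap
{
    private string contents;
    string read() { return contents; }
    void write(string data) { contents = data; }
}

// Untrusted code gets the narrowed, read-only view of the capability;
// calling write() here wouldn't even compile.
void untrustedViewer(ReadCap file)
{
    writefln("%s", file.read());
    // file.write("mwahaha");  // error: ReadCap has no write()
}

void main()
{
    auto doc = new FileCap;    // holder of the full capability
    doc.write("hello");
    untrustedViewer(doc);      // pass only what the callee needs
}

The ReadCap/WriteCap split plays the same role as the read-only cap vs. full cap (or invariant fileX) distinction above.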
Jun 20 2008
Don <nospam nospam.com.au> writes:
Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
 [snip]
 And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.
It's quite unnecessary on an x86. The x86 has page protection implemented in hardware. It's impossible to write to any memory which the OS hasn't explicitly given you. The problem occurs when the OS has buggy APIs which have exposed too much...
 
 Plus, maybe this has changed in recent years, but back when I was doing x86 
 asm (also about ten or so years ago), the x86 had *very* few general-purpose 
 registers. Like 4 or 5, IIRC. If that's still the case, that would just make 
 performance worse since the 5-6 extra registers this paper suggests would 
 turn into additional memory access (And I imagine they'd be cache-killing 
 accesses). I'm not sure that they mean by i860, though, 
 Intel-something-or-other probably, but I assume i860 isn't the same as 
 i86/x86.
It was an old Intel CPU.
 
 Granted, I know performance is a secondary, at best, concern for the types 
 of situations where you would want a sandbox. But, I can't help thinking 
 about rasterized drawing, video decompression, and other things Flash does, 
 and wonder what Flash would be like if the browser placed the flash plugin 
 (I mean the actual browser plugin, not an SWF) into this style of sandbox.
 
 Of course, VMs have overhead too (though I doubt Flash's rendering is done 
 in a VM), but I'm not up-to-date enough on all the modern VM implementation 
 details to know how a modern VM's overhead would compare to this. Maybe I'm 
 just confused, but I wonder if a just-in-time-compiled VM would have the 
 potential to be faster than this, simply because the VM's bytecode 
 (presumably) has no way of expressing unsafe behaviors, and therefore 
 anything translated by the VM itself from that "safe" bytecode to real 
 native code would not need those extra runtime checks. (Hmm, kinda weird to 
 think of a VM potentially being *faster* than native code for something).
 
 
Jun 17 2008
parent reply "Nick Sabalausky" <a a.a> writes:
"Don" <nospam nospam.com.au> wrote in message 
news:g37vm8$114c$1 digitalmars.com...
 Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
It seems that security/verifiability, and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security. There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
Thanks. Interesting read. Although expanding *every* write/jump/(and maybe read) from one instruction each into five instructions each kinda makes me cringe (But maybe it wouldn't need to be a 1-to-5 on every single write/jump after some sort of optimizing-compiler-style magic?). I know that paper claims an overhead of only 4.3% (I wish it had a link to an online copy of the benchmark tests/results), but it was written ten years ago and, as I understand it, pipelining and cache concerns make a far larger speed difference today than they did back then. And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.
It's quite unnecessary on an x86. The x86 has page protection implemented in hardware. It's impossible to write to any memory which the OS hasn't explicitly given you. The problem occurs when the OS has buggy APIs which have exposed too much...
What's the difference between that x86 page protection and whatever that new feature is (something about process protection I think?) that CPUs have just been starting to get? (boy, I'm out of the loop on this stuff)
Jun 17 2008
parent reply Don <nospam nospam.com.au> writes:
Nick Sabalausky wrote:
 "Don" <nospam nospam.com.au> wrote in message 
 news:g37vm8$114c$1 digitalmars.com...
 Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
It seems that security/verifiability, and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security. There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
Thanks. Interesting read. Although expanding *every* write/jump/(and maybe read) from one instruction each into five instructions each kinda makes me cringe (But maybe it wouldn't need to be a 1-to-5 on every single write/jump after some sort of optimizing-compiler-style magic?). I know that paper claims an overhead of only 4.3% (I wish it had a link to an online copy of the benchmark tests/results), but it was written ten years ago and, as I understand it, pipelining and cache concerns make a far larger speed difference today than they did back then. And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.
It's quite unnecessary on an x86. The x86 has page protection implemented in hardware. It's impossible to write to any memory which the OS hasn't explicitly given you. The problem occurs when the OS has buggy APIs which have exposed too much...
What's the difference between that x86 page protection and whatever that new feature is (something about process protection I think?) that CPUs have just been starting to get? (boy, I'm out of the loop on this stuff)
The page protection is set up by the OS, and only applies to user apps, not kernel drivers. From reading the AMD64 System Programming manual, it seems that the 'secure virtual machine' feature is roughly the same thing, except at an even deeper level: it prevents the OS kernel from accessing specific areas of memory or I/O. So it even allows you to sandbox the kernel (!)
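you can see that user-level page protection at work from any user program - on typical OSes, writing through a pointer to an unmapped page is stopped by the CPU, not by the language:

    void main()
    {
        int* p = null; // page 0 is never mapped for user code
        *p = 42;       // the CPU raises a page fault; the OS kills the
                       // process (segfault / access violation)
    }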
Jun 17 2008
parent Georg Wrede <georg nospam.org> writes:
Don wrote:
 Nick Sabalausky wrote:
 
 What's the difference between that x86 page protection and whatever 
 that new feature is (something about process protection I think?) that 
 CPUs have just been starting to get?  (boy, I'm out of the loop on 
 this stuff) 
The page protection is set up by the OS, and only applies to user apps, not kernel drivers. From reading the AMD64 System Programming manual, it seems that the 'secure virtual machine' feature is roughly the same thing, except at an even deeper level: it prevents the OS kernel from accessing specific areas of memory or I/O. So it even allows you to sandbox the kernel (!)
Gawhhhhh.

But seriously, that is the way to let you run virtual machines in which there could be several kernels, possibly of several operating systems.

So, as processors evolve and operating systems increasingly take advantage of the features of the existing processors, having the /next/ processor generation add yet another level of "priority" guarantees that operating systems written for the previous processor can all be virtualised with 100% accuracy, 100% efficiency, and 100% security. Without this it would be virtually (no pun intended) impossible.

---

Now, for the majority of operating systems today, this would not be a priority. (At least most Linuxes are compiled with the 386 as the target, even though it's about 5 years since "anybody ever" tried to run Linux on a real 386 -- dunno about Windows, but I assume most Windows versions are theoretically runnable on a 386, too.) Rather, it is a matter of Prudent Development: the only way you (as a processor manufacturer) can literally guarantee that the previous processor can be fully virtualised is to add yet another layer of privilege.
Jun 18 2008
prev sibling parent reply Georg Wrede <georg nospam.org> writes:
PatrickD wrote:
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
I've read a number of his previous rants, and I've generally found them interesting, informative, and thought provoking. Sometimes even entertaining. This one was an exception.
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!
 
 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
The above story, and the fact that he goes on ranting irrespective of the slide sequence, the fact that he blatantly generalizes, derides, self-promotes, the fact that the transcript includes the superfluous interjections from spoken language, that he cavalierly exaggerates, and some other details -- collectively lead me to think he's, ehhh, in an "accelerated state of mind".

There are basically a few different ways to achieve that state of mind (of which I'm not suggesting any, I'm merely enumerating the most usual ones here):

- A speed trip.
- The upper phase of bipolar syndrome.
- A state of mind that can be deliberately achieved through repeated self-assertion and self-excitement.
- The pre-onset stage of a nervous breakdown.
- A basically "God, I'm good" mindset.

Now, I'm not saying it's any of these, but it sure looks like it. But enough of that.

---

What especially made me glad here was (yet another) "subliminal" advertisement for D. At least we get exposure.

PS: well, most of his blog readers are intelligent-wannabes and professional programmer-wannabes, and the readership of his blogs must be immense. The more controversial the blog, the more readers you get. A little like tabloids, where you know that "The President escapes death" on the front page translates to "he almost stepped on a bee that could have stung him". Just take them with a truckload of salt.

But D being mentioned there gets a lot of eyeballs. And the D-related stuff was written with obvious respect for D! One really couldn't ask for anything better.
Jun 18 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Georg Wrede wrote:
 PatrickD wrote:
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
I've read a number of his previous rants, and I've generally found them interesting, informative, and thought provoking. Sometimes even entertaining. This one was an exception.
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
The above story, and the fact that he goes on ranting irrespective of the slide sequence, the fact that he blatantly generalizes, derides, self-promotes, the fact that the transcript includes the superfluous interjections from spoken language, that he cavalierly exaggerates, and some other details -- collectively lead me to think he's, ehhh, in an "accelerated state of mind".
My guess is he's more accustomed to writing blogs than presenting them as talks. So he was probably just a little hyped up on stage fright.

I think he makes some very good points in that talk, and they mostly make sense if you start with the premise that all software can and should be delivered over the web. He's working for Google, and before that at Amazon, so it's not surprising that his world view is skewed in that web-centric direction.

So I think he's just forgetting (or deliberately ignoring) the fact that someone still has to write that VM and the operating system it runs on, and those better run as fast as possible or no one will care how wonderfully "dynamic" it is.

--bb
Jun 18 2008
parent reply Georg Wrede <georg nospam.org> writes:
Bill Baxter wrote:
 Georg Wrede wrote:
 PatrickD wrote:

 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
I've read a number of his previous rants, and I've generally found them interesting, informative, and thought provoking. Sometimes even entertaining. This one was an exception.
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
The above story, and the fact that he goes on ranting irrespective of the slide sequence, the fact that he blatantly generalizes, derides, self-promotes, the fact that the transcript includes the superfluous interjections from spoken language, that he cavalierly exaggerates, and some other details -- collectively lead me to think he's, ehhh, in an "accelerated state of mind".
My guess is he's more accustomed to writing blogs than presenting them as talks. So he was probably just a little hyped up on stage fright.
Let's hope so. (OT: it took me more than the promised 20 minutes to read the stuff. Very much more. I guess I'm a slow reader.) :-)
 I think he makes some very good points in that talk, and they mostly make 
 sense if you start with the premise that all software can and should be 
 delivered over the web.  He's working for Google, and before that at 
 Amazon, so it's not surprising that his world view is skewed in that 
 web-centric direction.
 
 So I think he's just forgetting (or deliberately ignoring) the fact that 
 someone still has to write that VM and the operating system it runs on, 
 and those better run as fast as possible or no one will care how 
 wonderfully "dynamic" it is.
Considering that all the languages he talks about still have to be /compiled/ for the VM (JIT or no JIT), I have a hard time seeing the case for VMs as rock-solid and compelling.

Think about it. If I have a web site where I let viewers run their own code on my server, I could simply provide them with a rigged D compiler. The compiler (or a preprocessor - that would actually be easier for me) would flag no-nos in their source code as errors. No biggie. Or I might sandbox the running user binaries. Of course I should also enforce per-user quotas (or per-code-snippet quotas), and such. They could even write to the hard disk; an easy way would be to have a virtual filesystem in a file.

---

And then there's the choice nobody seems to suggest: running a VM that uses the processor's own ASM as the VM language. The (e.g. D) compiler would enforce the exclusion of dangerous idioms. Sure, this is more work than I'd personally care to do, but for some big company this should be a reasonable alternative.

---

Hmm. On second thought, there /is/ one case for the VM. And that is the choice of languages. The bunch of languages he is talking about, I guess, are more suited for this kind of "user-tinkering" than "Real Languages" like D. At least some of them are somewhat usable with hardly any programming experience. But that's definitely a language-choice issue, and not a VM/no-VM issue in itself.
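as a sketch of what such a rigged compiler's pre-pass could look like - purely illustrative, with a made-up blacklist and naive substring matching (a real checker would parse the code, since this one is fooled by comments and string literals):

    import std.algorithm : canFind;
    import std.stdio;
    import std.string : splitLines;

    // Constructs we refuse to accept in user-submitted D source.
    immutable string[] forbidden = [
        "asm",         // inline assembly
        "core.stdc",   // raw C bindings
        "std.process", // spawning processes
    ];

    // Returns true if no forbidden construct was found; otherwise
    // reports the offending token via the out parameter.
    bool looksSafe(string source, out string offender)
    {
        foreach (line; source.splitLines)
            foreach (tok; forbidden)
                if (line.canFind(tok))
                {
                    offender = tok;
                    return false;
                }
        return true;
    }

    void main()
    {
        string userCode = "void main() { asm { int 3; } }";
        string bad;
        if (!looksSafe(userCode, bad))
            writefln("rejected: forbidden construct '%s'", bad);
        else
            writeln("handing off to the real compiler...");
    }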
Jun 19 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Georg Wrede wrote:

 So I think he's just forgetting (or deliberately ignoring) the fact 
 that someone still has to write that VM and the operating system it 
 runs on, and those better run as fast as possible or no one will care 
 how wonderfully "dynamic" it is.
Considering that all the languages he talks about still have to be /compiled/ for the VM (JIT or no JIT), I have a hard time seeing the case for VMs as rock-solid and compelling. Think about it. If I have a web site where I let viewers run their own code on my server, I could simply provide them with a rigged D compiler. The compiler (or a preprocessor - that would actually be easier for me) would flag no-nos in their source code as errors. No biggie. Or I might sandbox the running user binaries.
Or you could just use Java's VM instead of trying to figure out how to make all that work. I think that's a big part of it. The Java VM works and is available today, so for people like Steve it makes sense to use it.
 ---
 
 And then there's the choice nobody seems to suggest: running a VM that 
 uses the processor's own ASM as the VM language. The (e.g. D) compiler 
 would enforce the exclusion of dangerous idioms.
That's kinda what the "virtual appliance" thing is about, isn't it? Running an app inside a VMware instance with some ASM as the VM's native tongue. --bb
Jun 19 2008
parent reply Georg Wrede <georg nospam.org> writes:
Bill Baxter wrote:
 Georg Wrede wrote:
 
 So I think he's just forgetting (or deliberately ignoring) the fact 
 that someone still has to write that VM and the operating system it 
 runs on, and those better run as fast as possible or no one will care 
 how wonderfully "dynamic" it is.
Considering that all the languages he talks about still have to be /compiled/ for the VM (JIT or no JIT), I have a hard time seeing the case for VMs as rock-solid and compelling. Think about it. If I have a web site where I let viewers run their own code on my server, I could simply provide them with a rigged D compiler. The compiler (or a preprocessor - that would actually be easier for me) would flag no-nos in their source code as errors. No biggie. Or I might sandbox the running user binaries.
Or you could just use Java's VM instead of trying to figure out how to make all that work. I think that's a big part of it. The Java VM works and is available today, so for people like Steve it makes sense to use it.
Hmm. I originally took it that he's promoting the VM as /itself/ having properties that make it the superior and Obvious choice. But maybe it's all simply about the Java VM being easy, ubiquitous, and mature.
 And then there's the choice nobody seems to suggest: running a VM that 
 uses the processor's own ASM as the VM language. The (e.g. D) compiler 
 would enforce the exclusion of dangerous idioms.
That's kinda what the "virtual appliance" thing is about, isn't it? Running an app inside a VMware instance with some ASM as the VM's native tongue.
Pretty close.
Jun 19 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Georg Wrede" <georg nospam.org> wrote in message 
news:485A582F.6050103 nospam.org...
 Or you could just use Java's VM instead of trying to figure out how to 
 make all that work.  I think that's a big part of it.  The Java VM works 
 and is available today, so for people like Steve it makes sense to use 
 it.
Hmm. I originally took it that he's promoting the VM as /itself/ having properties that make it the superior and Obvious choice. But maybe it's all simply about the Java VM being easy, ubiquitous, and mature.
I could be wrong, but I got the impression that he, like a lot of VM proponents (but not all!), isn't considering those to be two separate concepts. That is, I suspect they might be confusing "the way things currently are" (i.e., "VMs like the JVM have mature sandboxing and runtime reflection today, and such things aren't currently in non-VMs") with "the only way things can be" (i.e., "You can't have things like sandboxing and runtime reflection without a VM").
Jun 19 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Georg Wrede wrote:
 Hmm. I originally took it like he's promoting the VM as /itself/ having 
 properties that make it the superior and Obvious choice. But maybe it's 
 all simply about the Java VM bein easy ubiquitous and mature.
I see the advantage of a VM as being that, if you're inventing a new language, you don't have to bother writing an optimizer, code generator, or linker. Of course, LLVM should make that advantage moot as well.
Jun 19 2008
parent bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 I see the advantage of a VM as being that, if you're inventing a new language, 
 you don't have to bother writing an optimizer, code generator, or linker.
And often the GC, part of the standard library, some/most external modules, DBMS interfaces, GUI widgets, etc., too :-) That's why creating a language like Boo on .NET was doable by a single person in a few months. Bye, bearophile
Jun 20 2008