
digitalmars.D - Walter did you really go Ohhhh?

PatrickD <patrick.down gmail.com> writes:
http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html

<Steve Yegge>
He told me the other day, [talking about] one of my blog rants, that he didn't
agree with the point that I'd made that virtual machines are "obvious". You
know? I mean, of course you use a virtual machine!

But he's a compiler dude, and he says they're a sham, they're a farce, "I don't
get it!" And so I explained it [my viewpoint] to him, and he went: Ohhhhhhh.
</Steve Yegge>
Jun 15 2008
Walter Bright <newshound1 digitalmars.com> writes:
PatrickD wrote:
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
 
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!
 
 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
 
Steve and I did talk about VMs, and I did say that I thought they were a sham. Maybe I did say Ohhhhh, but that was more of understanding his point of view than agreeing with it. Steve also says in his blog that the advantage of VMs is language interoperability. I don't agree, since all you need for interoperability is an ABI. For compiled languages, the C ABI serves just fine, and as long as each language has a way to get at the C ABI, you have language interoperability. Case in point - D!
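To make that concrete, a minimal sketch of the extern(C) route (the function c_add is invented for illustration; it's defined in D here so the example stands alone, but it could equally come from a C compiler):

// This function has C linkage and the C calling convention. It could
// just as well live in a .c file compiled separately and linked in,
// with the D side declaring only:  extern (C) int c_add(int, int);
extern (C) int c_add(int a, int b) { return a + b; }

import std.stdio;

void main()
{
    // The call crosses a plain C-ABI boundary: no VM, no marshaling
    // layer, just a calling convention both sides agree on.
    writefln("%d", c_add(2, 3));
}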
Jun 15 2008
Jan Claeys <digitalmars janc.be> writes:
On Sun, 15 Jun 2008 11:40:33 -0700, Walter Bright wrote:

 Steve also says in his blog that the advantage of VMs is language
 interoperability. I don't agree, since all you need for interoperability
 is an ABI. For compiled languages, the C ABI serves just fine, and as
 long as each language has a way to get at the C ABI, you have language
 interoperability.
The C ABI isn't really useful if you want (standardised) interoperability on an OO level. That's where an OO machine design (be it a VM or a physical machine design--I don't care, and I don't see the difference for language designers) can be useful.

E.g., I've been wondering for some time what a hardware design like Linn's Rekursiv[*] could have meant to computer languages if it hadn't been killed for various reasons (mostly financial...).

[*] <http://www.cpushack.net/CPU/cpu7.html>

--
JanC
Jun 29 2008
"Nick Sabalausky" <a a.a> writes:
"PatrickD" <patrick.down gmail.com> wrote in message 
news:g33e3g$pdd$1 digitalmars.com...
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html

 <Steve Yegge>
 He told me the other day, [talking about] one of my blog rants, that he 
 didn't agree with the point that I'd made that virtual machines are 
 "obvious". You know? I mean, of course you use a virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a farce, "I 
 don't get it!" And so I explained it [my viewpoint] to him, and he went: 
 Ohhhhhhh.
 </Steve Yegge>
I've never read much of Steve Yegge's stuff (in large part because I have better things to do than read through a book-sized blog post. "But I deliberately make them long because it's the opposite of everyone else and that makes mine stand out!" Yea, good for you, I don't care.) But after reading through the Intro, "FOO Chaos", "The right way to do unit testing", and "Static Typing's Paper Tigers", I'm now convinced Steve's full of shit.

FOO Chaos:

First he says "VMs are great for language interop", then he demonstrates that VMs *don't* solve the language interop issue. Ok, fine, then he scales back his claim and says "Well, they help!" Doesn't do much to convince me that VMs are "obvious".

But then, the whole idea of VMs being better for language interop is preposterous anyway. After all, how do VMs work? You take a high-level language, compile it down to a sequence of pre-defined binary opcodes, and execute. Hey! Just like a real CPU! So if you can solve language interop on a VM, you can do the same thing to solve it for native code. And what is that thing that "solves" it for VMs? (Oh that's right - it doesn't solve it, it merely *helps* it.) A standard ABI, or at least something that basically boils down to a standard ABI. And that can't be done on native code...why?

So the strengths of VMs (and sure, there are some - but they're limited) do not lie in language interop. And maybe I'm wrong, but I'd imagine that a bigger problem for language interop would be different languages for which there is no single machine target (native or VM) that they all have in common.

The right way to do unit testing:

"And [on a dynamically-typed language] when it works [for that mock data], you're like, "Oh yeah, it works!" You don't run it through a compiler. You copy it and paste it into your unit test suite. That's one unit test, right? And you copy it into your code, ok, this is your function."

Soo... he's advocating the strategy of assuming something works just because it worked for your mock data? Unit tests and regression tests catch one set of bugs; a good statically-typed compiler catches another set. The sets probably intersect, but one is not a superset of the other.

"To a large extent, especially in C++ and Java, the way you develop is: [step 1, 2, 3, etc.] So it's this batch cycle, right? 1950s. Submit your punch cards, please."

I'm sure he didn't mean this as a serious argument, just a jab, but seriously, you could say the same thing about the scientific method. It's a step-by-step batch cycle too.

Static Typing's Paper Tigers:

"[Static Typing is a talisman that "keeps real tigers away". And I'm proving this by pointing out examples of big production systems written in dynamically-typed languages (while forgetting that VB and VB.NET both support and typically make appropriate use of static typing)]"

Ok, so you *can* make big production systems in dynamically-typed languages. So what? You *can* also do it in Perl or Assembly. I don't think anyone disputes that. You can build a whole house using a coin to drive in all your screws, but that doesn't turn screwdrivers into proverbial tiger-dispelling talismans. The question is: during the course of those programs' development (and maintenance), how much time, effort and money did they spend chasing after things that a good statically-typed language would have immediately caught/prevented? Oh, is it *those* things that are the proverbial tigers? So just because big production systems *have* been made using those languages, that automatically implies that the developers *didn't* ever come across those problems and have to spend their time overcoming them?

I did agree with Steve on one thing though: "What are the odds that XML's going to wind up being less verbose than *anything*?"
Jun 15 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:g33t27$2dq$1 digitalmars.com...
 "PatrickD" <patrick.down gmail.com> wrote in message 
 news:g33e3g$pdd$1 digitalmars.com...
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html

 <Steve Yegge>
 He told me the other day, [talking about] one of my blog rants, that he 
 didn't agree with the point that I'd made that virtual machines are 
 "obvious". You know? I mean, of course you use a virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a farce, "I 
 don't get it!" And so I explained it [my viewpoint] to him, and he went: 
 Ohhhhhhh.
 </Steve Yegge>
 [snip]
One more thing: I also take issue with Steve's implication (somewhere in that post, I can't find it in that haystack now) that you need VMs for runtime reflection. Umm... If debugging symbols/type-info can be injected into the executable and read/interpreted by a debugger at runtime, then they can be made readable by the program itself at runtime. (In fact, isn't there already a D library that enables runtime reflection by doing just that?) And if there's any needed info that's not in the injected debugging symbols, what's to stop the compiler/language-definition from just sticking it in the vtable, or something else akin to a vtable?
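A minimal sketch of that idea using what the D compiler already emits (nothing here depends on any particular reflection library):

import std.stdio;

class Widget {}

void main()
{
    Object o = new Widget();
    // typeid() follows the reference the compiler stored alongside the
    // object's vtable to its TypeInfo/ClassInfo record, so the dynamic
    // type name is readable at runtime; no VM involved.
    writefln("%s", typeid(o));
}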
Jun 15 2008
Robert Fraser <fraserofthenight gmail.com> writes:
Nick Sabalausky wrote:
 But then, the whole idea of VMs being better for language interop is 
 preposterous anyway. After all, how do VMs work? You take a 
 high-level-language, compile it down to a sequence of pre-defined binary 
 opcodes, and execute. Hey! Just like a real CPU! So if you can solve 
 language interop on a VM, you can do the same thing to solve it for native 
 code.
By that argument, anything that a VM can do, native code should be able to do. This is kind of true, but to get some of those things (i.e. hot-swapping, security management, selective dynamic loading) working, you almost need to implement a mini-VM.
Jun 15 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Robert Fraser" <fraserofthenight gmail.com> wrote in message 
news:g346g3$ou7$1 digitalmars.com...
 Nick Sabalausky wrote:
 But then, the whole idea of VMs being better for language interop is 
 preposterous anyway. After all, how do VMs work? You take a 
 high-level-language, compile it down to a sequence of pre-defined binary 
 opcodes, and execute. Hey! Just like a real CPU! So if you can solve 
 language interop on a VM, you can do the same thing to solve it for 
 native code.
 By that argument, anything that a VM can do, native code should be able to do. This is kind of true, but to get some of those things (i.e. hot-swapping, security management, selective dynamic loading) working, you almost need to implement a mini-VM.
True, but I guess what I was trying to say was "How do VMs work from the perspective of language interop?" From the security perspective, for instance, there are differences (with a VM, you can sandbox whatever you want, however you want, without requiring a physical CPU that supports the appropriate security features). But for language interop it all just comes down to "standard ABI", regardless of whether it's a VM's machine code or a real CPU's machine code.
Jun 15 2008
Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 By that argument, anything that a VM can do, native code should be able 
 to do. This is kind of true, but to get some of those things (i.e. 
 hot-swapping, security management, selective dynamic loading) working, 
 you almost need to implement a mini-VM.
Heck, there's nothing stopping one from writing a CPU instruction-set emulator and creating a 'VM' that way. There are only a couple hundred rather simple instructions that need to be emulated.
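A toy sketch of such a dispatch-loop 'VM' in D, with an invented four-instruction machine standing in for a real CPU's couple hundred:

import std.stdio;

enum Op : ubyte { push, add, print, halt }

// The whole 'VM': fetch an opcode, dispatch on it, repeat.
void run(const(ubyte)[] code)
{
    int[] stack;
    for (size_t pc = 0; pc < code.length; ++pc)
    {
        final switch (cast(Op) code[pc])
        {
        case Op.push:
            stack ~= code[++pc];          // next byte is the operand
            break;
        case Op.add:
            stack[$ - 2] += stack[$ - 1];
            stack = stack[0 .. $ - 1];
            break;
        case Op.print:
            writefln("%d", stack[$ - 1]);
            break;
        case Op.halt:
            return;
        }
    }
}

void main()
{
    ubyte[] program = [Op.push, 2, Op.push, 3, Op.add, Op.print, Op.halt];
    run(program);                         // prints 5
}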
Jun 15 2008
David Jeske <davidj gmail.com> writes:
Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences 
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.) 
It seems that security/verifiability and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software-based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security.

There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
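The core trick from that page, sketched in D (the segment size and layout are invented; real SFI rewrites the untrusted machine code itself so the mask can't be bypassed):

// Software-based fault isolation, toy version: every store the
// untrusted code makes is rewritten to mask its target address first,
// so the write can only ever land inside the sandbox segment.
enum size_t SEGMENT_BITS = 20;                 // 1 MB sandbox (made up)
enum size_t OFFSET_MASK  = (1 << SEGMENT_BITS) - 1;

void sandboxedStore(ubyte[] segment, size_t addr, ubyte value)
{
    // Clamp the offset instead of checking it: no branch, no fault; a
    // wild pointer just wraps to somewhere harmless inside the segment.
    segment[addr & OFFSET_MASK] = value;
}

void main()
{
    auto segment = new ubyte[1 << SEGMENT_BITS];
    sandboxedStore(segment, 0xDEAD_BEEF, 42);  // out-of-range address
    // the store landed at 0xDEADBEEF & OFFSET_MASK, inside the segment
}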
Jun 16 2008
parent reply "Nick Sabalausky" <a a.a> writes:
"David Jeske" <davidj gmail.com> wrote in message 
news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
 It seems that security/verifiability and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software-based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security. There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
Thanks. Interesting read. Although expanding *every* write/jump/(and maybe read) from one instruction each into five instructions each kinda makes me cringe (but maybe it wouldn't need to be a 1-to-5 on every single write/jump after some sort of optimizing-compiler-style magic?). I know that paper claims an overhead of only 4.3% (I wish it had a link to an online copy of the benchmark tests/results), but it was written ten years ago and, as I understand it, pipelining and cache concerns make a far larger speed difference today than they did back then. And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.

Plus, maybe this has changed in recent years, but back when I was doing x86 asm (also about ten or so years ago), the x86 had *very* few general-purpose registers. Like 4 or 5, IIRC. If that's still the case, that would just make performance worse, since the 5-6 extra registers this paper suggests would turn into additional memory accesses (and I imagine they'd be cache-killing accesses). I'm not sure what they mean by i860, though. Intel-something-or-other probably, but I assume i860 isn't the same as i86/x86.

Granted, I know performance is a secondary concern, at best, for the types of situations where you would want a sandbox. But I can't help thinking about rasterized drawing, video decompression, and other things Flash does, and wonder what Flash would be like if the browser placed the Flash plugin (I mean the actual browser plugin, not an SWF) into this style of sandbox.

Of course, VMs have overhead too (though I doubt Flash's rendering is done in a VM), but I'm not up-to-date enough on all the modern VM implementation details to know how a modern VM's overhead would compare to this. Maybe I'm just confused, but I wonder if a just-in-time-compiled VM would have the potential to be faster than this, simply because the VM's bytecode (presumably) has no way of expressing unsafe behaviors, and therefore anything translated by the VM itself from that "safe" bytecode to real native code would not need those extra runtime checks. (Hmm, kinda weird to think of a VM potentially being *faster* than native code for something.)
Jun 17 2008
Yigal Chripun <yigal100 gmail.com> writes:
Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
 [snip]
Could you explain please why there's a need for a sandbox in the first place? I think that security should be enforced by the OS.

On Windows, I see the need for external means of security like a VM, since the OS doesn't do security (Microsoft's sense of the word is to annoy the end user with a message box, requiring him to press OK several times...). But on other OSes that seems unnecessary, since the OS provides ways to manage security for code. Linux has SELinux, and there are newer OSes developed around the concept of capabilities.

So, unless I'm on Windows, what are the benefits of a VM that I won't get directly from the OS?

--Yigal
Jun 17 2008
Georg Wrede <georg nospam.org> writes:
Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
Jun 18 2008
Yigal Chripun <yigal100 gmail.com> writes:
Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
I disagree. Running code "as the user" is a bad concept, and OS security can and does protect the user's files from it. Current OSes use ACLs (Windows, Linux, etc.), and there's nothing stopping you from defining a file to be read-only, or non-executable, to protect data; the current practice is to define "users" for daemons in order to protect data. That's why Apache runs as user www-data with its own ACL rules. You can achieve perfect security with this scheme if you invest enough time to create a separate "user" for each process. As an example, I can run my browser as a different limited user, or use a browser which runs inside a sandbox. I get the same protection from both, but the sandbox solution has more overhead.

It's easy to see all the problems with manually defining ACLs. Newer OSes based on the concept of "capabilities" remove all those problems. Such OSes give processes defined capabilities unrelated to any concept of a user (the concept of users is defined on top of the capabilities mechanism).

Capabilities are basically the same as OOP. Simplified example: currently OSes are written in a procedural way; there are global data structures and global system calls. I.e., you print to screen via Stdout(text); in D, which in the end just calls the appropriate syscall. In a capabilities-based OS, there are no such global syscalls/functions. You need to hold an output instance (a handle in the OS - a Capability) in order to call its print method. Only if the process has that instance can it print to the screen. Security is implemented via the explicit passing of such instances: if the program received an output instance, it received the right to print to the screen. No sandboxes/VMs/any other emulation layer is needed.

--Yigal
Jun 19 2008
Georg Wrede <georg nospam.org> writes:
Yigal Chripun wrote:
 Georg Wrede wrote:
Yigal Chripun wrote:

could you explain please why there's a need for a sandbox in the
first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
I disagree. Running code "as the user" is a bad concept, and OS security can and does protect the user's files from it.
If the code that gets run "as the user" is malicious, and there are no additional guards, then the code could chmod any read-only file you have and then edit it, according to its malicious goals. In practice, these additional guards constitute the sandbox.
 current OSes use ACLs (windows, linux, etc..) and there's nothing
 stopping you from defining a file to be read only, or non-executable to
 protect data, and the current practice is to define "users" for deamons
 in order to protect data.
Not on my servers, they don't. I rely solely on user/group stuff. And I find it adequate.
 that's why apache runs with user www-data with
  its own ACL rules.
Apache has run as "www-data" or whatever, since the beginning of time, and that has been because it is natural and "obvious" to give the WWW server its own identity.
 you can achieve perfect security with this scheme if
 you invest enough time to create a separate "user" for each process. 
If this were so simple, then we'd have had no issue with this entire subject for the last 5 years. To put it another way, if the WWW server could run every user's code as a separate OS user, then of course things would be different. But the average Unix (Linux, etc) only has 16 bits of information to identify the "user". And sites like Google have users in the billions. So, it's not a viable option.
 as an example, I can run my browser as a different limited user or use a
 browser which runs inside a sandbox. I can get the same protection from
 both but the sandbox solution has more overhead.
Server and client problems should be kept separate in one's mindset.
 it's easy to see all the problems with manually defining ACLs.
 Newer OSes based on the concept of "capabilities" remove all those
 problems.
"All those problems". You've been listening to marketing talk.
 such OSes give processes defined capabilities unrelated to any
 concept of a user (the concept of users is defined on top of the
 capabilities mechanism).
I was the Oracle DB Head Administrator in the early '90s at a local University. The concept of Roles was introduced then by Oracle. I actually got pretty excited about this. Instead of Mary, Jo-Anne, and Jane all having their respective read, write and update rights, I could define Roles (which is pretty near the Capabilities concept), so that Updater of Student Credits, Updater of Student Addresses, Updater of Class Information, etc. could all be defined, and when any of the girls went on holidays, I could simply assign the Role to the back-up person, instead of spending days on fixing read-update-write rights for individual table columns and/or views.
 Capabilities are basically the same as OOP - simplified example:
 currently OSes are written in a procedural way, there are global data
 structures and global system calls. i.e. you print to screen via
 Stdout(text); in D which just calls in the end the appropriate syscall.
 in a capabilities based OS, there is no such global syscalls/functions.
...
 No sandboxes/VMs/any other emulation layer is needed.
Gee, nice. Still, D has to relate to what's going on today.
Jun 19 2008
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Yigal Chripun" <yigal100 gmail.com> wrote in message 
news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
 [snip]
If I understand all this right, it sounds like this is how it works within the context of browser plugins and applets embedded in a webpage (i.e., something like Flash/Java Applet/ActiveX):

Old (current) way: A browser is run as user X. Thus, the browser can do anything user X can do. A browser plugin, by its nature, can do anything the browser can do (read locally stored webpages, read/write cookies, cache and browser history, delete everything in /home/userX, etc). So to prevent a malicious webpage from embedding something that...well, acts maliciously, there are two options:

1. The browser plugin has to *be* a sandboxing platform like Flash or Java Applets, but unlike ActiveX. This plugin/platform is trusted to not expose unsafe things to the applets running inside of it.

2. (Better) The browser sets up a special limited-rights user for plugins/applets (or optionally, one for each plugin/applet, for finer-grained control). The plugin/applet is run as this limited-rights user.

New way: Sounds like basically the same thing, except replace "user X" with "a few OS handles", and "browser creates 'browserPlugin' user" with "browser selectively passes its own OS handles to the plugins as it sees fit"?

And I suppose you configure the OS to grant/disallow these handles in more or less the same way user rights are currently granted? Except they're granted to programs in addition to/instead of users? And I'd assume you'd still need some sort of ACLs so a program can't just go, "Aha! I need to open/save files, so I got a 'write file' handle and a 'read file' handle! Now I can use those handles to read all of the person's private data and overwrite the system files with pictures of potatoes!"
Jun 19 2008
Yigal Chripun <yigal100 gmail.com> writes:
Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message 
 news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
I disagree. OS security can and does protect the user's files from code that's run "as the user" <-this is a bad concept. current OSes use ACLs (windows, linux, etc..) and there's nothing stopping you from defining a file to be read only, or non-executable to protect data, and the current practice is to define "users" for deamons in order to protect data. that's why apache runs with user www-data with its own ACL rules. you can achieve perfect security with this scheme if you invest enough time to create a separate "user" for each process. as an example, I can run my browser as a different limited user or use a browser which runs inside a sandbox. I can get the same protection from both but the sandbox solution has more overhead. it's easy to see all the problems with manually defining ACLs. Newer OSes based on the concept of "capabilities" remove all those problems. such OSes give processes defined capabilities unrelated to any concept of a user (the concept of users is defined on top of the capabilities mechanism). Capabilities are basically the same as OOP - simplified example: currently OSes are written in a procedural way, there are global data structures and global system calls. i.e. you print to screen via Stdout(text); in D which just calls in the end the appropriate syscall. in a capabilities based OS, there is no such global syscalls/functions. you need to hold an output instance (a handle in the OS - a Capability) in order to call its print method. only if the process has that instance it can print to the screen. security is implemented via the explicit passing of such instances. so if the program received an output instance, it received the right to print to the screen. No sandboxes/VMs/any other emulation layer is needed. --Yigal
If I understand all this right, it sounds like this is how it works within the context of browser plugins and applets embedded in a webpage (ie, something like Flash/Java Applet/ActiveX): Old (current) way: A browser is run as user X. Thus, the browser can do anything user X can do. A browser plugin, by its nature, can do anything the browser can do (read locally stored webpages, read/write cookies cache and browser history, delete everything in /home/userX, etc). So to prevent a malicious webpage from embedding something that...well, acts maliciously, there are two options: 1. The browser plugin has to *be* a sandboxing platform like Flash or Java Applets, but unlike ActiveX. This plugin/platform is trusted to not expose unsafe things to the applets running inside of it. 2. (Better) The browser sets up a special limited-rights user for plugins/applets (or optionally, one for each plugin/applet, for finer-grained control). The plugin/applet is run as this limited rights user.
yep
 New way:
 Sounds like basically the same thing except replace "user X" with "a few OS 
 handles", and "browser creates 'browserPlugin' user" with "browser 
 selectively passes its own OS handles to the plugins as it sees fit"?
Not exactly the same thing. The major difference is this: with users/roles/groups/rules/ACLs/etc., the security is separate from the code. See the explanation below.
 
 And I suppose you configure the OS to grant/disallow these handles in more 
 or less the same way user rights are currently granted? Except they're 
 granted to programs in addition to/instead of users? And I'd assume you'd 
 still need some sort of ACLs so a program can't just go, "Aha! I need to 
 open/save files, so I got a 'write file' handle and a 'read file' handle! 
 Now I can use those handles to read all of the person's private data and 
 overwrite the system files with pictures of potatoes!"
 
The concept of users in the system is implemented /on top/ of the capabilities mechanisms in the kernel (or actually a micro-kernel, to be precise). The kernel has no concept of users at all; this is all implemented in user space.

Think of it like this: processes on the system are entities that have certain capabilities. They can exchange those capabilities with each other. Users would also be implemented in that system as entities (of a different kind) that can have the same capabilities. This means that security is implemented at a lower level.

Here's a snippet from the relevant article on Wikipedia:

Suppose that the user program successfully executes the following statement:

int fd = open("/etc/passwd", O_RDWR);

The variable fd now contains the index of a file descriptor in the process's file descriptor table. This file descriptor is a capability. Its existence in the process's file descriptor table is sufficient to know that the process does indeed have legitimate access to the object. A key feature of this arrangement is that the file descriptor table is in kernel memory and cannot be directly manipulated by the user program.
Jun 20 2008
parent reply "Nick Sabalausky" <a a.a> writes:
"Yigal Chripun" <yigal100 gmail.com> wrote in message 
news:g3gltb$1b4a$1 digitalmars.com...
 Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message
 news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
 [snip]
yep
 New way:
 Sounds like basically the same thing except replace "user X" with "a few 
 OS
 handles", and "browser creates 'browserPlugin' user" with "browser
 selectively passes its own OS handles to the plugins as it sees fit"?
 Not exactly the same thing. The major difference is this: with users/roles/groups/rules/ACLs/etc., the security is separate from the code. See the explanation below.
 And I suppose you configure the OS to grant/disallow these handles in 
 more
 or less the same way user rights are currently granted? Except they're
 granted to programs in addition to/instead of users? And I'd assume you'd
 still need some sort of ACLs so a program can't just go, "Aha! I need to
 open/save files, so I got a 'write file' handle and a 'read file' handle!
 Now I can use those handles to read all of the person's private data and
 overwrite the system files with pictures of potatoes!"
 [snip]
Ok, I think I'm starting to get it, but I'm still a little fuzzy on some stuff. (For clarity, I'm going to use the terms "human user" and "OS user" to disambiguate what I mean by "the user". Human user of course being the actual person, and OS user being the OS's concept of a user.) Suppose I'm writing a hex editor for one of these capabilities-based OSes. I've got to be able to read/write various files.

Non-capabilities way: The human user runs my app. The app is run as either OS user "userX", or some special OS user that the program was configured to run as. The human user tells my app, "Open file 'fileX' (passed by filename)". My app then says to the OS, "Open file X for me, based on the credentials of whatever OS user I'm being run as". The OS then looks up the ACL info and grants/denies access accordingly.

Capabilities way: The human user runs my app. The human's OS user object (name of this object is 'userX') is passed to my app kinda like a command line parameter would be. The human user tells my app, "Open file 'fileX' (passed by filename)". My app then *doesn't* talk to the OS, but instead goes to userX: "userX.openfile(fileX, whateverAccessLevel)". If the OS user has that capability and is willing to give it to my app, then it returns the appropriate capability. If the OS user doesn't have that capability, then it requests it from whatever its authority is (who? The OS?), just like how the app requested it from the OS user. Somehow, the OS user's authority (if it's successfully able to retrieve access from its authority) decides whether or not to allow userX access (I still can only imagine this part involves some sort of ACL or ACL-equivalent).
Jun 20 2008
Yigal Chripun <yigal100 gmail.com> writes:
Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message 
 news:g3gltb$1b4a$1 digitalmars.com...
 Nick Sabalausky wrote:
 "Yigal Chripun" <yigal100 gmail.com> wrote in message
 news:g3ekcq$82e$1 digitalmars.com...
 Georg Wrede wrote:
 Yigal Chripun wrote:
 could you explain please why there's a need for a sandbox in the
 first-place?
OS security protects the system and the other users from you. A sandbox protects you yourself from code that's run "as you". (That is, protects your files, etc.)
 [snip]
yep
 New way:
 Sounds like basically the same thing except replace "user X" with "a few 
 OS
 handles", and "browser creates 'browserPlugin' user" with "browser
 selectively passes its own OS handles to the plugins as it sees fit"?
 [snip]
 And I suppose you configure the OS to grant/disallow these handles in 
 more
 or less the same way user rights are currently granted? Except they're
 granted to programs in addition to/instead of users? And I'd assume you'd
 still need some sort of ACLs so a program can't just go, "Aha! I need to
 open/save files, so I got a 'write file' handle and a 'read file' handle!
 Now I can use those handles to read all of the person's private data and
 overwrite the system files with pictures of potatoes!"
 [snip]
 Capabilities way: The human user runs my app. The human's OS user object (name of this object is 'userX') is passed to my app kinda like a command line parameter would be. The human user tells my app, "Open file 'fileX' (passed by filename)". My app then *doesn't* talk to the OS, but instead goes to userX: "userX.openfile(fileX, whateverAccessLevel)". [snip]
First, a Wikipedia link: http://en.wikipedia.org/wiki/Capabilities

Now let's try to figure out the second way. Let's define a system with 3 users: root, userA, userB. From a security POV those are 3 entities with caps [I'll use that instead of typing "capabilities"]. userA has all caps [view, edit, accessGUI] for image file A, and a play cap for audio file B. When the real person runs a gimp process on file A as userA, the gimp process receives the user's caps on creation. The gimp process can use the edit cap to edit file A, and show the changes on screen via the accessGUI cap.

In your example, think of userX as a list of caps. When your human user runs the app, it runs it with app.run(listOfCapsForApp) {that's pseudo code}. The app doesn't know or care about the user; it has a list of caps.

On a *nix system you can include header files and call syscalls, which are just global functions provided by the OS. With caps, the headers define something like classes [which in turn need to receive a different set of caps to create an instance of themselves]. Your app can get a File cap (which is an OS object instance) that defines the relevant syscalls as methods on that object. So instead of openfile(fileX, whateverAccessLevel), you get a fileX from the user and do a fileX.open(). You do not pass the accessLevel to open(), since the fact that you have that fileX reference in your app implies you already have the needed access level to use its methods.

Instead of separate access rules, think of it like this: if you do not have a fileX object, then you cannot do anything on that file. If you have a fileX object, you can use all its methods. If you have an invariant fileX (to use the D terminology), then you can only use the invariant methods.

The system has a predefined set of caps when booted. You can use those caps to create more caps and pass those to different processes, and so on. The UI can translate user actions to caps - for example, you can think of opening a file in a dialog in your editor app as [implicitly] providing the process a cap to that file. If you have only read-only access to the file (as defined in your OS user), then you can only give the process a read-only cap.

One last note: caps allow you to limit the behavior of processes - even if your MP3 file contains a virus that deletes all your files, you can safely play it, since the player only has a cap to play that file and cannot run arbitrary code on the system (even if you run the player as root!), because the player process doesn't need any concept of a user to work and only cares about the caps it currently has.
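Roughly what that fileX idea looks like as code - a sketch in D with invented types; a real capability OS would hand out kernel-protected handles rather than class instances:

import std.stdio;

// Possession of one of these interfaces *is* the permission: there is
// no global open() to ask, and no ACL lookup at call time.
interface ReadCap
{
    string read();
}

interface WriteCap : ReadCap
{
    void write(string data);
}

class FileCap : WriteCap
{
    private string contents;
    string read() { return contents; }
    void write(string data) { contents = data; }
}

// Untrusted code gets the narrowed, read-only view of the capability;
// calling write() here wouldn't even compile.
void untrustedViewer(ReadCap file)
{
    writefln("%s", file.read());
    // file.write("mwahaha");  // error: ReadCap has no write()
}

void main()
{
    auto doc = new FileCap;    // holder of the full capability
    doc.write("hello");
    untrustedViewer(doc);      // pass only what the callee needs
}

The ReadCap/WriteCap split plays the same role as the read-only cap vs. full cap (or invariant fileX) distinction above.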
Jun 20 2008
Don <nospam nospam.com.au> writes:
Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
 [snip]
 And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.
It's quite unnecessary on an x86. The x86 has page protection implemented in hardware. It's impossible to write to any memory which the OS hasn't explicitly given you. The problem occurs when the OS has buggy APIs which have exposed too much...
 
 Plus, maybe this has changed in recent years, but back when I was doing x86 
 asm (also about ten or so years ago), the x86 had *very* few general-purpose 
 registers. Like 4 or 5, IIRC. If that's still the case, that would just make 
 performance worse since the 5-6 extra registers this paper suggests would 
 turn into additional memory access (And I imagine they'd be cache-killing 
 accesses). I'm not sure that they mean by i860, though, 
 Intel-something-or-other probably, but I assume i860 isn't the same as 
 i86/x86.
It was an old Intel CPU.
 
 Granted, I know performance is a secondary, at best, concern for the types 
 of situations where you would want a sandbox. But, I can't help thinking 
 about rasterized drawing, video decompression, and other things Flash does, 
 and wonder what Flash would be like if the browser placed the flash plugin 
 (I mean the actual browser plugin, not an SWF) into this style of sandbox.
 
 Of course, VMs have overhead too (though I doubt Flash's rendering is done 
 in a VM), but I'm not up-to-date enough on all the modern VM implementation 
 details to know how a modern VM's overhead would compare to this. Maybe I'm 
 just confused, but I wonder if a just-in-time-compiled VM would have the 
 potential to be faster than this, simply because the VM's bytecode 
 (presumably) has no way of expressing unsafe behaviors, and therefore 
 anything translated by the VM itself from that "safe" bytecode to real 
 native code would not need those extra runtime checks. (Hmm, kinda weird to 
 think of a VM potentially being *faster* than native code for something).
 
 
Jun 17 2008
parent reply "Nick Sabalausky" <a a.a> writes:
"Don" <nospam nospam.com.au> wrote in message 
news:g37vm8$114c$1 digitalmars.com...
 Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
It seems that security/verifiability, and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security. There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
Thanks. Interesting read. Although expanding *every* write/jump/(and maybe read) from one instruction each into five instructions each kinda makes me cringe (But maybe it wouldn't need to be a 1-to-5 on every single write/jump after some sort of optimizing-compiler-style magic?). I know that paper claims an overhead of only 4.3% (I wish it had a link to an online copy of the benchmark tests/results), but it was written ten years ago and, as I understand it, pipelining and cache concerns make a far larger speed difference today than they did back then. And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.
It's quite unnecessary on an x86. The x86 has page protection implemented in hardware. It's impossible to write to any memory which the OS hasn't explicitly given you. The problem occurs when the OS has buggy APIs which have exposed too much...
What's the difference between that x86 page protection and whatever that new feature is (something about process protection I think?) that CPUs have just been starting to get? (boy, I'm out of the loop on this stuff)
Jun 17 2008
parent reply Don <nospam nospam.com.au> writes:
Nick Sabalausky wrote:
 "Don" <nospam nospam.com.au> wrote in message 
 news:g37vm8$114c$1 digitalmars.com...
 Nick Sabalausky wrote:
 "David Jeske" <davidj gmail.com> wrote in message 
 news:g37coj$2q9u$1 digitalmars.com...
 Nick Sabalausky Wrote:
 ... From the security perspective, for instance, there are differences
 (With a VM, you can sandbox whatever you want, however you want,
 without requiring a physical CPU that supports the appropriate security
 features.)
It seems that security/verifiability, and ease of executing on an unknown target processor are the two major benefits of a VM. However, you might be interested in looking at software based fault isolation if you have not seen it. It may make you reconsider how much you need a VM to implement code security. There is a pretty simple explanation here: http://www.cs.unm.edu/~riesen/prop/node16.html
Thanks. Interesting read. Although expanding *every* write/jump/(and maybe read) from one instruction each into five instructions each kinda makes me cringe (But maybe it wouldn't need to be a 1-to-5 on every single write/jump after some sort of optimizing-compiler-style magic?). I know that paper claims an overhead of only 4.3% (I wish it had a link to an online copy of the benchmark tests/results), but it was written ten years ago and, as I understand it, pipelining and cache concerns make a far larger speed difference today than they did back then. And, while I'm no x86 asm expert, what they're proposing strikes me as something that might be rather pipeline/cache-unfriendly.
It's quite unnecessary on an x86. The x86 has page protection implemented in hardware. It's impossible to write to any memory which the OS hasn't explicitly given you. The problem occurs when the OS has buggy APIs which have exposed too much...
What's the difference between that x86 page protection and whatever that new feature is (something about process protection I think?) that CPUs have just been starting to get? (boy, I'm out of the loop on this stuff)
The page protection is set up by the OS, and only applies to user apps, not kernel drivers. From reading the AMD64 System Programming manual, it seems that the 'secure virtual machine' feature is roughly the same thing, except at an even deeper level: it prevents the OS kernel from accessing specific areas of memory or I/O. So it even allows you to sandbox the kernel (!)
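you can see that user-level page protection at work from any user program - on typical OSes, writing through a pointer to an unmapped page is stopped by the CPU, not by the language:

    void main()
    {
        int* p = null; // page 0 is never mapped for user code
        *p = 42;       // the CPU raises a page fault; the OS kills the
                       // process (segfault / access violation)
    }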
Jun 17 2008
parent Georg Wrede <georg nospam.org> writes:
Don wrote:
 Nick Sabalausky wrote:
 
 What's the difference between that x86 page protection and whatever 
 that new feature is (something about process protection I think?) that 
 CPUs have just been starting to get?  (boy, I'm out of the loop on 
 this stuff) 
The page protection is set up by the OS, and only applies to user apps, not kernel drivers. From reading the AMD64 System Programming manual, it seems that the 'secure virtual machine' feature is roughly the same thing, except at an even deeper level: it prevents the OS kernel from accessing specific areas of memory or I/O. So it even allows you to sandbox the kernel (!)
Gawhhhhh.

But seriously, that is the way to let you run virtual machines in which there could be several kernels, possibly of several operating systems.

So, as processors evolve and operating systems increasingly take advantage of the features of the existing processors, having the /next/ processor generation add yet another level of "priority" guarantees that operating systems written for the previous processor can all be virtualised with 100% accuracy, 100% efficiency, and 100% security. Without this it would be virtually (no pun intended) impossible.

---

Now, for the majority of operating systems today, this would not be a priority. (At least most Linuxes are compiled with the 386 as the target, even though it's about 5 years since "anybody ever" tried to run Linux on a real 386 -- dunno about Windows, but I assume most Windows versions are theoretically runnable on a 386, too.) Rather, it is a matter of Prudent Development: the only way you (as a processor manufacturer) can literally guarantee that the previous processor can be fully virtualised is to add yet another layer of privilege.
Jun 18 2008
prev sibling parent reply Georg Wrede <georg nospam.org> writes:
PatrickD wrote:
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
I've read a number of his previous rants, and I've generally found them interesting, informative, and thought provoking. Sometimes even entertaining. This one was an exception.
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!
 
 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
The above story, and the fact that he goes on ranting irrespective of the slide sequence, the fact that he blatantly generalizes, derides, self-promotes, the fact that the transcript includes the superfluous interjections from spoken language, that he cavalierly exaggerates, and some other details -- collectively lead me to think he's, ehhh, in an "accelerated state of mind".

There are basically a few different ways to achieve that state of mind (of which I'm not suggesting any, I'm merely enumerating the most usual ones here):

- A speed trip.
- The upper phase of bipolar syndrome.
- A state of mind that can be deliberately achieved through repeated self-assertion and self-excitement.
- The pre-onset stage of a nervous breakdown.
- A basically "God, I'm good" mindset.

Now, I'm not saying it's any of these, but it sure looks like it. But enough of that.

---

What especially made me glad here was (yet another) "subliminal" advertisement for D. At least we get exposure.

PS: well, most of his blog readers are intelligent-wannabes and professional programmer-wannabes, and the readership of his blogs must be immense. The more controversial the blog, the more readers you get. A little like tabloids, where you know that "The President escapes death" on the front page translates to "he almost stepped on a bee that could have stung him". Just take them with a truckload of salt.

But D being mentioned there gets a lot of eyeballs. And the D-related stuff was written with obvious respect for D! One really couldn't ask for anything better.
Jun 18 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Georg Wrede wrote:
 PatrickD wrote:
 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
I've read a number of his previous rants, and I've generally found them interesting, informative, and thought provoking. Sometimes even entertaining. This one was an exception.
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
The above story, and the fact that he goes on ranting irrespective of the slide sequence, the fact that he blatantly generalizes, derides, self-promotes, the fact that the transcript includes the superfluous interjections from spoken language, that he cavalierly exaggerates, and some other details -- collectively lead me to think he's, ehhh, in an "accelerated state of mind".
My guess is he's more accustomed to writing blogs than presenting them as talks. So he was probably just a little hyped up on stage fright.

I think he makes some very good points in that talk, and they mostly make sense if you start with the premise that all software can and should be delivered over the web. He's working for Google, and before that at Amazon, so it's not surprising that his world view is skewed in that web-centric direction.

So I think he's just forgetting (or deliberately ignoring) the fact that someone still has to write that VM and the operating system it runs on, and those better run as fast as possible or no one will care how wonderfully "dynamic" it is.

--bb
Jun 18 2008
parent reply Georg Wrede <georg nospam.org> writes:
Bill Baxter wrote:
 Georg Wrede wrote:
 PatrickD wrote:

 http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html
I've read a number of his previous rants, and I've generally found them interesting, informative, and thought provoking. Sometimes even entertaining. This one was an exception.
 <Steve Yegge> He told me the other day, [talking about] one of my
 blog rants, that he didn't agree with the point that I'd made that
 virtual machines are "obvious". You know? I mean, of course you use a
 virtual machine!

 But he's a compiler dude, and he says they're a sham, they're a
 farce, "I don't get it!" And so I explained it [my viewpoint] to him,
 and he went: Ohhhhhhh. </Steve Yegge>
The above story, and the fact that he goes on ranting irrespective of the slide sequence, the fact that he blatantly generalizes, derides, self-promotes, the fact that the transcript includes the superfluous interjections from spoken language, that he cavalierly exaggerates, and some other details -- collectively lead me to think he's, ehhh, in an "accelerated state of mind".
My guess is he's more accustomed to writing blogs than presenting them as talks. So he was probably just a little hyped up on stage fright.
Let's hope so. (OT: it took me more than the promised 20 minutes to read the stuff. Very much more. I guess I'm a slow reader.) :-)
 I think he makes some very good points in that talk, and they mostly make 
 sense if you start with the premise that all software can and should be 
 delivered over the web.  He's working for Google, and before that at 
 Amazon, so it's not surprising that his world view is skewed in that 
 web-centric direction.
 
 So I think he's just forgetting (or deliberately ignoring) the fact that 
 someone still has to write that VM and the operating system it runs on, 
 and those better run as fast as possible or no one will care how 
 wonderfully "dynamic" it is.
Considering that all the languages he talks about still have to be /compiled/ for the VM (JIT or no JIT), I have a hard time seeing the case for VMs as rock-solid and compelling.

Think about it. If I have a web site where I let viewers run their own code on my server, I could simply provide them with a rigged D compiler. The compiler (or a preprocessor - that would actually be easier for me) would flag no-nos in their source code as errors. No biggie. Or I might sandbox the running user binaries. Of course I should also enforce per-user quotas (or per-code-snippet quotas), and such. They could even write to the hard disk; an easy way would be to have a virtual filesystem in a file.

---

And then there's the choice nobody seems to suggest: running a VM that uses the processor's own ASM as the VM language. The (e.g. D) compiler would enforce the exclusion of dangerous idioms. Sure, this is more work than I'd personally care to do, but for some big company this should be a reasonable alternative.

---

Hmm. On second thought, there /is/ one case for the VM. And that is the choice of languages. The bunch of languages he is talking about, I guess, are more suited for this kind of "user-tinkering" than "Real Languages" like D. At least some of them are somewhat usable with hardly any programming experience. But that's definitely a language-choice issue, and not a VM/no-VM issue in itself.
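as a sketch of what such a rigged compiler's pre-pass could look like - purely illustrative, with a made-up blacklist and naive substring matching (a real checker would parse the code, since this one is fooled by comments and string literals):

    import std.algorithm : canFind;
    import std.stdio;
    import std.string : splitLines;

    // Constructs we refuse to accept in user-submitted D source.
    immutable string[] forbidden = [
        "asm",         // inline assembly
        "core.stdc",   // raw C bindings
        "std.process", // spawning processes
    ];

    // Returns true if no forbidden construct was found; otherwise
    // reports the offending token via the out parameter.
    bool looksSafe(string source, out string offender)
    {
        foreach (line; source.splitLines)
            foreach (tok; forbidden)
                if (line.canFind(tok))
                {
                    offender = tok;
                    return false;
                }
        return true;
    }

    void main()
    {
        string userCode = "void main() { asm { int 3; } }";
        string bad;
        if (!looksSafe(userCode, bad))
            writefln("rejected: forbidden construct '%s'", bad);
        else
            writeln("handing off to the real compiler...");
    }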
Jun 19 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Georg Wrede wrote:

 So I think he's just forgetting (or deliberately ignoring) the fact 
 that someone still has to write that VM and the operating system it 
 runs on, and those better run as fast as possible or no one will care 
 how wonderfully "dynamic" it is.
Considering that all the languages he talks about still have to be /compiled/ for the VM (JIT or no JIT), I have a hard time seeing the case for VMs as rock-solid and compelling. Think about it. If I have a web site where I let viewers run their own code on my server, I could simply provide them with a rigged D compiler. The compiler (or a preprocessor - that would actually be easier for me) would flag no-nos in their source code as errors. No biggie. Or I might sandbox the running user binaries.
Or you could just use Java's VM instead of trying to figure out how to make all that work. I think that's a big part of it. The Java VM works and is available today, so for people like Steve it makes sense to use it.
 ---
 
 And then there's the choice nobody seems to suggest: running a VM that 
 uses the processor's own ASM as the VM language. The (e.g. D) compiler 
 would enforce the exclusion of dangerous idioms.
That's kinda what the "virtual appliance" thing is about, isn't it? Running an app inside a VMware instance with some ASM as the VM's native tongue. --bb
Jun 19 2008
parent reply Georg Wrede <georg nospam.org> writes:
Bill Baxter wrote:
 Georg Wrede wrote:
 
 So I think he's just forgetting (or deliberately ignoring) the fact 
 that someone still has to write that VM and the operating system it 
 runs on, and those better run as fast as possible or no one will care 
 how wonderfully "dynamic" it is.
Considering that all the languages he talks about still have to be /compiled/ for the VM (JIT or no JIT), I have a hard time seeing the case for VMs as rock-solid and compelling. Think about it. If I have a web site where I let viewers run their own code on my server, I could simply provide them with a rigged D compiler. The compiler (or a preprocessor - that would actually be easier for me) would flag no-nos in their source code as errors. No biggie. Or I might sandbox the running user binaries.
Or you could just use Java's VM instead of trying to figure out how to make all that work. I think that's a big part of it. The Java VM works and is available today, so for people like Steve it makes sense to use it.
Hmm. I originally took it that he's promoting the VM as /itself/ having properties that make it the superior and Obvious choice. But maybe it's all simply about the Java VM being easy, ubiquitous, and mature.
 And then there's the choice nobody seems to suggest: running a VM that 
 uses the processor's own ASM as the VM language. The (e.g. D) compiler 
 would enforce the exclusion of dangerous idioms.
That's kinda what the "virtual appliance" thing is about, isn't it? Running an app inside a VMware instance with some ASM as the VM's native tongue.
Pretty close.
Jun 19 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Georg Wrede" <georg nospam.org> wrote in message 
news:485A582F.6050103 nospam.org...
 Or you could just use Java's VM instead of trying to figure out how to 
 make all that work.  I think that's a big part of it.  The Java VM works 
 and is available today, so for people like Steve it makes sense to use 
 it.
Hmm. I originally took it that he's promoting the VM as /itself/ having properties that make it the superior and Obvious choice. But maybe it's all simply about the Java VM being easy, ubiquitous, and mature.
I could be wrong, but I got the impression that he, like a lot of VM proponents (but not all!), isn't considering those to be two separate concepts. That is, I suspect they might be confusing "the way things currently are" (i.e., "VMs like the JVM have mature sandboxing and runtime reflection today, and such things aren't currently in non-VMs") with "the only way things can be" (i.e., "You can't have things like sandboxing and runtime reflection without a VM").
Jun 19 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Georg Wrede wrote:
 Hmm. I originally took it like he's promoting the VM as /itself/ having 
 properties that make it the superior and Obvious choice. But maybe it's 
 all simply about the Java VM bein easy ubiquitous and mature.
I see the advantage of a VM as being that, if you're inventing a new language, you don't have to bother writing an optimizer, code generator, or linker. Of course, LLVM should make that advantage moot as well.
Jun 19 2008
parent bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 I see the advantage of a VM as being that, if you're inventing a new language, 
 you don't have to bother writing an optimizer, code generator, or linker.
And often the GC, part of the standard library, some/most external modules, DBMS interfaces, GUI widgets, etc., too :-) That's why creating a language like Boo on .NET was doable by a single person in a few months. Bye, bearophile
Jun 20 2008