
digitalmars.dip.ideas - Mixin C

reply Paul Backus <snarwin gmail.com> writes:
What if instead of importing C files like D modules, we could 
write bits of C code directly in the middle of our D code, like 
we do with inline ASM?

It might look something like this:

```d
void main()
{
     mixin(C) {
         #include <stdio.h>
         printf("Hello from C!\n");
     }
}
```

Here's how it could work:

1. The compiler takes the content of the `mixin(C)` block and 
passes it through the external C preprocessor.
2. The result of (1) is parsed as a C AST fragment using the 
ImportC parser.
3. The result of (2) is spliced into the AST in place of the 
`mixin(C)` block, and undergoes semantic analysis using ImportC 
semantics.

Mixin C would solve two big issues with the current ImportC 
approach: the poor preprocessor support, and the conflicts 
between `.c` and `.d` files in the compiler's import paths.

Because Mixin C runs the preprocessor at the point of *usage* 
rather than the point of *definition*, it allows you to make full 
use of C APIs that rely on the preprocessor, without having to 
translate macros to D (either automatically or by hand).

Because Mixin C blocks appear as code fragments inside `.d` 
files, rather than as separate `.c` files, you'll never have to 
worry about accidentally importing a C file when you meant to 
import a D module.

By the way, if you did want to treat a `.c` or `.h` file like its 
own module, you'd still be able to do so with Mixin C. Just write 
a simple `.d` wrapper, like this:

```d
module libwhatever;

mixin(C) {
     #include <libwhatever.h>
}
```

I haven't spent much time fleshing out the details of this idea, 
but it seems pretty promising. What do you guys think?
Mar 07
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
ImportC is fundamentally a compiler specific extension.

Having raw C blocks in a D file gives me concern for IDE's wrt. syntax 
highlighting.

We do have a comparable feature with inline assembly support in ldc/gdc 
where it uses strings.

I would suggest that this is the direction to go in rather than a raw 
code block.

It's a good idea that I do think is the right approach to the problem.
Mar 07
parent reply Paul Backus <snarwin gmail.com> writes:
On Friday, 8 March 2024 at 03:37:02 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Having raw C blocks in a D file gives me concern for IDE's wrt. 
 syntax highlighting.

 We do have a comparable feature with inline assembly support in 
 ldc/gdc where it uses strings.

 I would suggest that this is the direction to go in rather than 
 a raw code block.
Yeah that's reasonable. Honestly it would barely even look different if you use a q{...} string:

    mixin(C) q{
        #include <stdio.h>
        printf("Hello from C!\n");
    };

I guess the lexer currently barfs on #include, but surely we can bend the rules on that if we need to.
Mar 07
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/03/2024 5:02 PM, Paul Backus wrote:
 On Friday, 8 March 2024 at 03:37:02 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 Having raw C blocks in a D file gives me concern for IDE's wrt. syntax 
 highlighting.

 We do have a comparable feature with inline assembly support in 
 ldc/gdc where it uses strings.

 I would suggest that this is the direction to go in rather than a raw 
 code block.
 Yeah that's reasonable. Honestly it would barely even look different if you use a q{...} string:

     mixin(C) q{
         #include <stdio.h>
         printf("Hello from C!\n");
     };

 I guess the lexer currently barfs on #include, but surely we can bend the rules on that if we need to.
I was thinking about a new kind of string, one that produces a struct that has both a language string that the user provided as well as the contents. That way editors can syntax highlight if they understand it, or ignore it if they don't. Might be overkill, but it does have some interesting possibilities.
Mar 07
prev sibling next sibling parent zjh <fqbqrr 163.com> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
 What if instead of importing C files like D modules, we could 
 write bits of C code directly in the middle of our D code, like 
 we do with inline ASM?
Why not create a separate file extension, such as `.dc`, and then use it like this:

```d
//aa.dc
#include <stdio.h>
printf("Hello from C!\n");
//b.d
import aa;
```
Mar 07
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
 ```d
 module libwhatever;

 mixin(C) {
     #include <libwhatever.h>
 }
 ```

 I haven't spent much time fleshing out the details of this 
 idea, but it seems pretty promising. What do you guys think?
The first thing that comes to mind is that it'll allow doing things inline that would otherwise be impossible in D.

```d
import std.stdio;

extern(C) int func(int x);

mixin(C) {
    // gdc supports these asm-declarations via gcc (ldc should too via clang)
    asm("
        .globl func
        .type func, function
    func:
        .cfi_startproc
        movl %edi, %eax
        addl $1, %eax
        ret
        .cfi_endproc
    ");
}

int main()
{
    int n = func(72);

    // mixin(C) not necessary as gdc+ldc support this natively with asm{""::;}
    mixin(C) {
        asm ("leal (%0,%0,4),%0" : "=r" (n) : "0" (n));
    }
    writeln("73*5 = ", n); // 73*5 = 365

    // ditto, asm{""}, but this is purely for presentation.
    mixin(C) {
        asm ("movq $60, %rax\n"
             "movq $2, %rdi\n"
             "syscall");
    }
    assert(0);
}
```
Mar 08
prev sibling next sibling parent reply Lance Bachmeier <no spam.net> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:

 I haven't spent much time fleshing out the details of this 
 idea, but it seems pretty promising. What do you guys think?
Translation of C macros to D can sometimes [give incorrect behavior](https://github.com/dlang/dmd/pull/16199). With the current implementation, it does that silently, and that obviously raises questions about the reliability of the final product. My proposal is to allow the user to compile C code inside unit tests so there's at least a chance of catching bugs. What you are proposing would make that possible.
Mar 08
next sibling parent Lance Bachmeier <no spam.net> writes:
On Friday, 8 March 2024 at 14:39:46 UTC, Lance Bachmeier wrote:
 On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:

 I haven't spent much time fleshing out the details of this 
 idea, but it seems pretty promising. What do you guys think?
Translation of C macros to D can sometimes [give incorrect behavior](https://github.com/dlang/dmd/pull/16199). With the current implementation, it does that silently, and that obviously raises questions about the reliability of the final product. My proposal is to allow the user to compile C code inside unit tests so there's at least a chance of catching bugs. What you are proposing would make that possible.
And I realize you have addressed the macro thing too, but I think this is separately a valid use case.
Mar 08
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
You can add unit tests for C code already:

```
import myccode; // #define cmacro(a) 2

unittest
{
     assert(cmacro(1) == 2);
}
```
Mar 29
parent bachmeier <no spam.net> writes:
On Friday, 29 March 2024 at 07:21:43 UTC, Walter Bright wrote:
 You can add unit tests for C code already:

 ```
 import myccode; // #define cmacro(a) 2

 unittest
 {
     assert(cmacro(1) == 2);
 }
 ```
The problem raised in that discussion has to do with things like

```
#define DOUBLE(x) (x) + (x)
DOUBLE(i++);
```

The output of DOUBLE isn't the problem, it's the part below it, which you referred to as metaprogramming. You can only test that by running the preprocessor on both lines. Currently, you'd have to create a new C file and add it to your project in order to put it in a unittest.

Now that some time has passed, where I've used this feature with at least 100,000 lines of C code, I'm not as concerned about it.
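To make the `DOUBLE` concern concrete, here is a minimal C sketch. It swaps `i++` for a call to a counting helper, because `(i++) + (i++)` is undefined behavior in C and would not give a reliable demonstration. The names `double_fn`, `next_value`, `calls_via_macro` and `calls_via_function` are made up for illustration and do not come from the linked discussion.

```c
/* The macro from the post: its argument text is pasted in twice. */
#define DOUBLE(x) (x) + (x)

/* An ordinary function evaluates its argument exactly once. */
static int double_fn(int x) { return x + x; }

/* Hypothetical helper that records how many times it was called. */
static int calls = 0;
static int next_value(void) { calls++; return 10; }

/* Number of next_value() calls made when going through the macro. */
int calls_via_macro(void)
{
    calls = 0;
    /* Expands to (next_value()) + (next_value()) */
    int r = DOUBLE(next_value());
    (void)r;
    return calls;
}

/* Number of next_value() calls made when going through the function. */
int calls_via_function(void)
{
    calls = 0;
    int r = double_fn(next_value());
    (void)r;
    return calls;
}
```

The macro form calls the helper twice and the function form calls it once. A translation of the macro into a D function would change that count without any diagnostic, which is the kind of difference that can only be tested by running the real preprocessor on the use site.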
Mar 29
prev sibling next sibling parent reply Tim <tim.dlang t-online.de> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
 What if instead of importing C files like D modules, we could 
 write bits of C code directly in the middle of our D code, like 
 we do with inline ASM?
Are multiple mixin(C) blocks evaluated in the same context? Symbols and macros from one mixin(C) block could then be used in another mixin(C) block, like in this example:

```D
mixin(C) {
#include <stdio.h>
};
void main()
{
    mixin(C) {
        printf("test\n");
    }
}
```

The preprocessor call for the second block would need to know all macros from the first call.

Can code in mixin(C) statements access local variables from D? How would name conflicts be resolved when an identifier exists both in the current module and a C header file? In the following example `BUFSIZ` is both a local variable and a macro from a C header:

```D
void main()
{
    int BUFSIZ = 5;
    mixin(C) {
        #include <stdio.h>
        printf("BUFSIZ = %d\n", BUFSIZ);
    }
}
```

Are variables declared in mixin(C) statements interpreted as global or local variables?

```D
void main()
{
    mixin(C) {
        #include <stdio.h>
        fprintf(stderr, "test\n");
    }
}
```

The header declares variable `stderr`. If this is now a local variable, because the header is included inside a function, it could cause problems. Maybe this could be solved by treating C variables marked with `extern` as globals.
Mar 08
parent Paul Backus <snarwin gmail.com> writes:
On Friday, 8 March 2024 at 17:06:21 UTC, Tim wrote:
 Are multiple mixin(C) blocks evaluated in the same context? 
 Symbols and macros from one mixin(C) block could then be used 
 in another mixin(C) block, like in this example:
 ```D
 mixin(C) {
 #include <stdio.h>
 };
 void main()
 {
     mixin(C) {
         printf("test\n");
     }
 }
 ```
 The preprocessor call for the second block would need to know 
 all macros from the first call.
Each block is preprocessed separately, and the result of preprocessing is then evaluated (type-checked and compiled) in the context of the enclosing D scope. So, symbols are visible across different blocks, but preprocessor macros are not.

Your example would work, because it would expand to this:

```d
extern(C) int printf(const(char)* format, ...);
// Other definitions from stdio.h ...

void main()
{
    printf("test\n");
}
```

But this example would not work:

```d
mixin(C) {
    #include <stdio.h>
    #define info(s) printf("[info] %s\n", s);
}

void main()
{
    mixin(C) {
        info("test"); // error - undefined identifier 'info'
    }
}
```
 Can code in mixin(C) statements access local variables from D? 
 How would name conflicts be resolved when an identifier exists 
 both in the current module and a C header file? In the 
 following example `BUFSIZ` is both a local variable and a macro 
 from a C header:
 ```D
 void main()
 {
     int BUFSIZ = 5;
     mixin(C) {
         #include <stdio.h>
         printf("BUFSIZ = %d\n", BUFSIZ);
     }
 }
 ```
Name lookup in mixin(C) blocks would follow the normal D scoping rules. In this example, since BUFSIZ is a macro, it would be expanded by the preprocessor before the D compiler even parses the C code, and the value from `stdio.h` would be printed. If BUFSIZ were a variable instead of a macro, then you would get a compile-time error for defining two variables with the same name in the same scope.
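The ordering described here (macro expansion first, name lookup second) can be seen in plain C. The following is only a sketch: `limit` is a hypothetical stand-in for a header macro like `BUFSIZ`, and both function names are invented for the example.

```c
/* While the macro is defined, the token `limit` never reaches the
   compiler proper, so the local variable of the same name is ignored. */
int seen_while_macro_defined(void)
{
    int limit = 5;
    (void)limit;
    /* From this line on, every `limit` token is replaced by 42. */
#define limit 42
    return limit;
#undef limit
}

/* With no macro in effect, normal scoped lookup finds the variable. */
int seen_without_macro(void)
{
    int limit = 5;
    return limit;
}
```

The first function returns the macro's value even though a variable named `limit` is in scope, which matches the `BUFSIZ` answer above: the preprocessor rewrites the token before any scoping rule gets a chance to apply.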
 Are variables declared in mixin(C) statements interpreted as 
 global or local variables?
 ```D
 void main()
 {
     mixin(C) {
         #include <stdio.h>
         fprintf(stderr, "test\n");
     }
 }
 ```
 The header declares variable `stderr`. If this is now a local 
 variable, because the header is included inside a function, it 
 could cause problems. Maybe this could be solved by treating C 
 variables marked with `extern` as globals.
I believe the C standard actually requires such variables to be treated as globals. The relevant sections are [6.2.2 Linkages of identifiers][1] and [6.2.4 Storage durations of objects][2]. So, assuming the D compiler implements the C standard correctly, this should Just Work.
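For reference, this is how block-scope `extern` behaves in ordinary C, as a small sketch with made-up names (`shared_counter`, `bump_counter`). It says nothing about how ImportC implements this today; it only illustrates the rule from the standard.

```c
/* File-scope object: external linkage, static storage duration. */
int shared_counter = 10;

int bump_counter(void)
{
    /* A block-scope declaration with `extern` does not create a new
       local object. It refers to the file-scope shared_counter above,
       so the update below persists across calls. */
    extern int shared_counter;
    shared_counter += 1;
    return shared_counter;
}
```

A header that declares `extern FILE *stderr;` would fall under the same rule when included inside a function body: the declaration names the global object rather than introducing a local one.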
Mar 08
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
 By the way, if you did want to treat a `.c` or `.h` file like 
 its own module, you'd still be able to do so with Mixin C. Just 
 write a simple `.d` wrapper, like this:

 ```d
 module libwhatever;

 mixin(C) {
     #include <libwhatever.h>
 }
 ```
This also makes it trivial to apply function attributes to C declarations. For example, if you want everything from `libwhatever.h` to be `nothrow` and `@nogc`, you just write this:

```d
nothrow @nogc
mixin(C) {
    #include <libwhatever.h>
}
```

There's currently no way to do this with ImportC, and even if there were, it would likely require modifying the header file (for example, with the `#pragma` suggested in [issue 23812][1]).

[1]: https://issues.dlang.org/show_bug.cgi?id=23812
Mar 08
next sibling parent Daniel N <no public.email> writes:
On Friday, 8 March 2024 at 18:03:47 UTC, Paul Backus wrote:
 ```d
 nothrow @nogc
 mixin(C) {
     #include <libwhatever.h>
 }
 ```

 There's currently no way to do this with ImportC, and even if 
 there were, it would likely require modifying the header file 
 (for example, with the `#pragma` suggested in [issue 23812][1]).

 [1]: https://issues.dlang.org/show_bug.cgi?id=23812
wow, that is cool!
Mar 09
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/8/2024 10:03 AM, Paul Backus wrote:
 There's currently no way to do this with ImportC
```
__declspec(nothrow) int foo();
```
Mar 29
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
 What if instead of importing C files like D modules, we could 
 write bits of C code directly in the middle of our D code, like 
 we do with inline ASM?

 It might look something like this:

 ```d
 void main()
 {
     mixin(C) {
         #include <stdio.h>
         printf("Hello from C!\n");
     }
 }
 ```

 Here's how it could work:

 1. The compiler takes the content of the `mixin(C)` block and 
 passes it through the external C preprocessor.
 2. The result of (1) is parsed as a C AST fragment using the 
 ImportC parser.
 3. The result of (2) is spliced into the AST in place of the 
 `mixin(C)` block, and undergoes semantic analysis using ImportC 
 semantics.
So the entirety of `stdio.h` is included in the body of the D main function? Is that wise?

The other problem is if you want to use C expressions in D. For example, let's say you have the C definition:

```c
#define PI 3.14159
```

How can I use this in D land? I could assign it to a variable maybe?

```d
mixin(C) {
   #include "pidef.h"
   double PI_ = PI;
}
```

Note, I have to use a new name. And it has to be a variable, because that's all you can do in C. What if I wanted it to be an enum? Too bad, C doesn't support that.

What I'd like to see is:

a) the C preprocessor is run on *all the mixin(C) islands of the file* regardless of where they appear, whether they are in templates, etc. Basically, take all the mixin(C) things and concatenate them, run the result through the preprocessor, and put the results back where they were. THEN run the importC compiler on them. This allows a more cohesive C-like experience, without having to import/define things over and over.

b) Allow mixin(C) expressions, such as `enum PI = mixin(C) { PI }`. Maybe this was already the intention? But I didn't get that vibe from the proposal.

-Steve
Mar 22
parent reply Paul Backus <snarwin gmail.com> writes:
On Saturday, 23 March 2024 at 02:51:31 UTC, Steven Schveighoffer 
wrote:
 So the entirety of `stdio.h` is included in the body of the D 
 main function? Is that wise?
In this specific example, it's overkill. In general...is there a better alternative? The C preprocessor is a blunt instrument, and if we want to have full support for it, we are going to have to live with the consequences of that bluntness. With Mixin C, the D programmer at least gets to choose whether they would rather pay the cost of `#include`-ing C headers multiple times, or the cost of translating preprocessor macros by hand.
 The other problem is if you want to use C expressions in D. For 
 example, let's say you have the C definition:

 ```c
 #define PI 3.14159
 ```

 How can I use this in D land? I could assign it to a variable 
 maybe?

 ```d
 mixin(C) {
    #include "pidef.h"
    double PI_ = PI;
 }
 ```

 Note, I have to use a new name. And it has to be a variable, 
 because that's all you can do in C. What if I wanted it to be 
 an enum? Too bad, C doesn't support that.
You can use a lambda:

```d
enum PI = () {
    mixin(C) {
        #include "pidef.h"
        return PI;
    }
}();
```
 What I'd like to see is:

 a) the C preprocessor is run on *all the mixin(C) islands of 
 the file* regardless of where they appear, whether they are in 
 templates, etc. Basically, take all the mixin(C) things and 
 concatenate them, run the result through the preprocessor, and 
 put the results back where they were. THEN run the importC 
 compiler on them. This allows a more cohesive C-like 
 experience, without having to import/define things over and 
 over.
Some downsides to this approach:

1. Concatenating all of the `mixin(C)` blocks in a module for preprocessing violates D's scoping rules and creates a lot of opportunities for "spooky action at a distance."

2. This would allow sharing macro definitions across `mixin(C)` blocks, but would *not* allow sharing declarations. You'd still have to `#include <stdio.h>` twice if you wanted to call `printf` in two different blocks, for example.

3. In order to "put the results back where they were" the D compiler would have to parse the preprocessor's output for [line markers][1]. Since the format of these is not specified by the C standard, this means the D compiler would have to have separate parsers for each C preprocessor implementation (or, at least, one for gcc/clang and one for MSVC).

[1]: https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html
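To give a sense of what point 3 involves, here is a rough C sketch of a parser for one GCC-style line marker of the form `# linenum "filename" flags`, as described in the GCC documentation linked above. The `line_marker` struct and the `parse_gcc_line_marker` function are hypothetical names invented for this example, and the trailing flags are simply ignored.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the fields of one line marker. */
struct line_marker {
    int line;
    char file[256];
};

/* Parses text of the form: # <line> "<file>" [flags...]
   Returns 1 and fills *out on success, 0 otherwise. */
int parse_gcc_line_marker(const char *text, struct line_marker *out)
{
    int line = 0;
    char file[256];

    /* %255[^"] reads the file name up to the closing quote and
       cannot overflow the 256-byte buffer. */
    if (sscanf(text, "# %d \"%255[^\"]\"", &line, file) != 2)
        return 0;

    out->line = line;
    strcpy(out->file, file);
    return 1;
}
```

This sketch rejects anything that starts with `#line`, which as far as I know is the form MSVC's preprocessor output uses, so a second parser would be needed there. That is the per-implementation cost the point above is describing.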
Mar 23
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Saturday, 23 March 2024 at 19:02:52 UTC, Paul Backus wrote:
 On Saturday, 23 March 2024 at 02:51:31 UTC, Steven 
 Schveighoffer wrote:
 So the entirety of `stdio.h` is included in the body of the D 
 main function? Is that wise?
In this specific example, it's overkill. In general...is there a better alternative? The C preprocessor is a blunt instrument, and if we want to have full support for it, we are going to have to live with the consequences of that bluntness.
My objection is to the *requirement* that you include it in the main function. It's very different from D nested imports, as it redefines everything inside the function. Having to re-import *everything* everywhere you need to use a macro is really bad.
 You can use a lambda:

 ```d
 enum PI = () {
     mixin(C) {
         #include "pidef.h"
         return PI;
     }
 }();
 ```
Ugh, still having to include the entirety of a C header inside a function context.
 What I'd like to see is:

 a) the C preprocessor is run on *all the mixin(C) islands of 
 the file* regardless of where they appear, whether they are in 
 templates, etc. Basically, take all the mixin(C) things and 
 concatenate them, run the result through the preprocessor, and 
 put the results back where they were. THEN run the importC 
 compiler on them. This allows a more cohesive C-like 
 experience, without having to import/define things over and 
 over.
Some downsides to this approach: 1. Concatenating all of the `mixin(C)` blocks in a module for preprocessing violates D's scoping rules and creates a lot of opportunities for "spooky action at a distance."
But that's what you get with C. For instance, you can #define a macro inside a function, and use it inside another function, as long as it comes later in the file. It's not spooky to C programmers. You can even #undef things or re #define them.
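For anyone who hasn't seen it, this is the C behavior in question, as a minimal sketch with invented names (`SCALE`, `first`, `second`, `third`):

```c
int first(void)
{
    /* Defined inside a function body, but macros ignore block scope. */
#define SCALE 3
    return 1 * SCALE;
}

int second(void)
{
    /* Still visible here: the definition lasts until an #undef or
       the end of the translation unit. */
    return 2 * SCALE;
}

/* Redefinition takes effect for everything that follows. */
#undef SCALE
#define SCALE 10

int third(void)
{
    return 2 * SCALE;
}
```

The only thing that matters is textual order in the file. Concatenating all the mixin(C) blocks of a module before preprocessing would carry exactly this rule over to D source files.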
 2. This would allow sharing macro definitions across `mixin(C)` 
 blocks, but would *not* allow sharing declarations. You'd still 
 have to `#include <stdio.h>` twice if you wanted to call 
 `printf` in two different blocks, for example.
But you wouldn't have to include them inside the functions. You get the function definitions and macros in the right place (at module level).
 3. In order to "put the results back where they were" the D 
 compiler would have to parse the preprocessor's output for line 
 markers. Since the format of these is not specified by the C 
 standard, this means the D compiler would have to have separate 
 parsers for each C preprocessor implementation (or, at least, 
 one for gcc/clang and one for MSVC).
I came up with an approach for this, which I detailed in my DConf talk last year. All preprocessors have a flag which preserves comments. But yes, you are right, this is a big hacky problem to solve.

-Steve
Mar 27
parent Paul Backus <snarwin gmail.com> writes:
On Thursday, 28 March 2024 at 02:24:34 UTC, Steven Schveighoffer 
wrote:
 On Saturday, 23 March 2024 at 19:02:52 UTC, Paul Backus wrote:
 Some downsides to this approach:

 1. Concatenating all of the `mixin(C)` blocks in a module for 
 preprocessing violates D's scoping rules and creates a lot of 
 opportunities for "spooky action at a distance."
But that's what you get with C. For instance, you can #define a macro inside a function, and use it inside another function, as long as it comes later in the file. It's not spooky to C programmers. You can even #undef things or re #define them.
I think most C programmers would regard this as a horrifying abuse of the preprocessor--the kind of thing they switch to D (and other languages) to get away from.
 2. This would allow sharing macro definitions across 
 `mixin(C)` blocks, but would *not* allow sharing declarations. 
 You'd still have to `#include <stdio.h>` twice if you wanted 
 to call `printf` in two different blocks, for example.
But you wouldn't have to include them inside the functions. You get the function definitions and macros in the right place (at module level).
It sounds like the usage pattern you're envisioning is something like this:

```d
/// Bindings for libfoo
module foo;

mixin(C) {
    #include "libfoo.h"
}

/// Wraps foo_do_stuff
void doStuff(int x)
{
    mixin(C) {
        // reuses top-level #include
        foo_do_stuff(x, FOO_SOME_MACRO);
    }
}

/// Wraps foo_do_other_stuff
void doOtherStuff(const(char)* s)
{
    mixin(C) {
        // reuses top-level #include
        foo_do_other_stuff(s, FOO_SOME_OTHER_MACRO);
    }
}
```

I agree that supporting this usage pattern would be desirable, but I'm not sure concatenating every `mixin(C)` block in a module is the best way to do so.

Perhaps instead we can have a dedicated "`mixin(C)` header" block that can appear at module scope, whose content is prepended to each `mixin(C)` block? E.g.,

```d
/// Bindings for libfoo
module foo;

mixinC_header {
    #include "foo.h"
}

void doStuff(int x)
{
    mixin(C) {
        // #include "foo.h" inserted here
        foo_do_stuff(x, FOO_SOME_MACRO);
    }
}

// etc.
```

Granted, this only solves the UX problems of the original proposal, not the performance problems--under the hood you are still doing a bunch of separate preprocessor calls, and including the same file over and over. But, again, you are not *forced* to use Mixin C like this, and programmers who want to optimize their build times will still have alternatives to turn to that do not require opening the door to total macro madness.
Mar 27
prev sibling next sibling parent max haughton <maxhaton gmail.com> writes:
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
 What if instead of importing C files like D modules, we could 
 write bits of C code directly in the middle of our D code, like 
 we do with inline ASM?

 [...]
https://github.com/dlang/dmd/pull/14114
Mar 28
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/7/2024 7:23 PM, Paul Backus wrote:
 Mixin C would solve two big issues with the current ImportC approach: the poor 
 preprocessor support,
I don't see how it would improve preprocessor support. C macros used in D code won't fare any better than they do now.
 and the conflicts between `.c` and `.d` files in the compiler's import paths.
Frankly, I never understood why that was an issue. Change the name of one of them, or put them in different paths.

I appreciate the effort you've put into this. But I have to be blunt. I've seen this before. Here's what it looks like in practice:

https://elsmar.com/elsmarqualityforum/media/redneck-car-air-conditioning.1560/

D is a beautiful language to program in. Let's keep it that way!

https://www.joemacari.com/stock/ferrari-daytona-spyder/10004904
Mar 29
parent reply Paul Backus <snarwin gmail.com> writes:
On Friday, 29 March 2024 at 07:49:23 UTC, Walter Bright wrote:
 I appreciate the effort you've put into this. But I have to be 
 blunt. I've seen this before. Here's what it looks like in 
 practice:

 https://elsmar.com/elsmarqualityforum/media/redneck-car-air-conditioning.1560/
Message received. I won't spend any more time on this going forward.
Mar 29
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2024 2:22 PM, Paul Backus wrote:
 On Friday, 29 March 2024 at 07:49:23 UTC, Walter Bright wrote:
 I appreciate the effort you've put into this. But I have to be blunt. I've 
 seen this before. Here's what it looks like in practice:

 https://elsmar.com/elsmarqualityforum/media/redneck-car-air-conditioning.1560/
Message received. I won't spend any more time on this going forward.
Your time is valuable and I don't want to waste it.
Mar 29