digitalmars.D - Question about GCC / GDC / LDC syntax of inline asm advanced

reply Cecil Ward <cecil cecilward.com> writes:
In GDC and LDC’s inline asm syntax, the main asm part is 
separated from the constraints block by the first colon that 
begins the ‘outputs’ section. My question: how does GDC / LDC / 
GCC parse the first part, given that there can be umpteen kinds 
of assembler language? Is the parsing asm-dialect-specific, so 
that a full parse finds the first significant colon?

If not, and the very first colon (outside double-quoted strings 
and comments) ends the first section, which is how I parse it, 
then there is a problem, as labels contain colons. So I have 
a bug in the grammar for my kludge asm section parser; see the 
thread elsewhere.

About labels then, is a label the only place a non 
section-terminator colon can occur?

A label has its colon after the name, not before it; is that 
correct for the AT&T asm dialect? If so, I could require a 
newline plus optional whitespace before any colon that is to be 
recognised as a section terminator and the beginning of the 
constraints.
Jul 06 2023
next sibling parent Cecil Ward <cecil cecilward.com> writes:
On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
 [...]
I would have put this question in the GDC or LDC sections, but it applies to both.
Jul 06 2023
prev sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
 In GDC and LDC’s inline asm syntax, the main asm part is 
 separated from the constraints block by the first colon that 
 begins the ‘outputs’ section. My question: how does GDC / LDC / 
 GCC parse the first part, given that there can be umpteen kinds 
 of assembler language. Is the parsing asm dialect-specific so 
 that a full parse finds the first significant colon ?
The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:
```
asm {
    (test ? "if-true-insn" : "if-false-insn")
    ~ buildAsmString(foo, bar)
    ~ test2() ? enumInsnTrue : enumInsnFalse // assign-expression finishes here
    : output-constraints
    : ...
}
```
It's only at semantic-time that a "string-literal" result is enforced using CTFE.
 If not and the very first colon (outside double-quoted strings 
 and comments) ends the first section, which is how I parse it, 
 then there is a problem, as labels contain colons. And so I 
 have a bug in my gramma for my kludge asm section parser, see 
 thread elsewhere.
Labels are statements, so there shouldn't be any conflict between the two.
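To illustrate the CTFE point, here is a minimal sketch of the string-building side only. `buildAsmString`, `useIncrement` and `insn` are hypothetical names made up for this example, and the asm statement itself appears only in a comment because it is specific to GDC and LDC and has not been tested here.

```d
// Hypothetical helper: an ordinary function that CTFE can evaluate
// whenever its result is needed at compile time.
string buildAsmString(string mnemonic, string operand)
{
    return mnemonic ~ " " ~ operand;
}

enum bool useIncrement = true; // assumed compile-time flag

// The ternary's colon belongs to the expression; it is not a section
// separator. The enum initializer forces compile-time evaluation.
enum string insn = buildAsmString(useIncrement ? "incl" : "decl", "%0");

static assert(insn == "incl %0");

// With GDC or LDC, an expression like this could then head an asm
// statement, something like:
//     asm { insn : "+r" (counter); }
// (untested sketch; the constraint syntax follows GCC's extended asm)
```

Declaring `insn` as an `enum` is what guarantees the whole expression, ternary included, has been folded to a plain string before anything else looks at it.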
Jul 07 2023
next sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
[...]
As shown in compiler explorer: https://d.godbolt.org/z/eTqvW8o8E
Jul 07 2023
parent Cecil Ward <cecil cecilward.com> writes:
On Friday, 7 July 2023 at 12:12:15 UTC, Iain Buclaw wrote:
 On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
[...]
Many thanks Iain. Much appreciated. I just had a go at checking for the ? : expression with a simple state machine. As I mentioned, I’m not doing a proper parse here, not by a million miles, just the bare minimum to get it to work.
Jul 07 2023
prev sibling parent reply Cecil Ward <cecil cecilward.com> writes:
On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 [...]
Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string; otherwise I have to forbid it, because, aside from detecting labels, I am treating a colon as a section terminator. It is a kludge, but I don’t have months to spend.
Jul 07 2023
parent reply Cecil Ward <cecil cecilward.com> writes:
On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 [...]
Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?
Jul 07 2023
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 [...]
Where is the label being referenced/defined from?

Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html): `[A-Za-z._][0-9A-Za-z._]*`

For a label referenced from an asm statement (goto labels section)? Then it's the same as any other D [Identifier](https://dlang.org/spec/lex.html#Identifier): same as above minus the `.`, but also including Unicode alpha characters.
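As a rough illustration of the first case, the pattern above can be checked without a regex engine. This is only a sketch of the simplified ASCII pattern as quoted, allowing single-character names; `isAsmSymbolName` is a made-up name, and the check deliberately ignores things GNU As also accepts, such as `$` on most targets, numeric local labels like `1:`, and non-ASCII characters.

```d
import std.ascii : isAlpha, isAlphaNum;

/// Returns true if `name` matches the simplified ASCII pattern
/// [A-Za-z._][0-9A-Za-z._]* (first character: letter, '.' or '_').
bool isAsmSymbolName(string name)
{
    if (name.length == 0)
        return false;

    static bool isStart(char c) { return isAlpha(c) || c == '.' || c == '_'; }
    static bool isRest(char c)  { return isAlphaNum(c) || c == '.' || c == '_'; }

    if (!isStart(name[0]))
        return false;

    // Iterate over code units; any byte outside ASCII is rejected.
    foreach (char c; name[1 .. $])
        if (!isRest(c))
            return false;

    return true;
}
```

A label definition inside the instruction string would then be such a name followed directly by `:`, for example `.Lloop:`.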
Jul 07 2023
next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
 On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 [...]
Where is the label being referenced/defined from? Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html): `[A-Za-z._][0-9A-Za-z._]*`
Saying that, I'm pretty sure GNU As accepts Unicode in symbol names, as I have encountered reports of testsuite failures on Solaris 10/11 that involved Oracle's assembler and Unicode in D function names (gcc and gccgo instead encode the Unicode characters in a symbol name, IIRC).
Jul 07 2023
prev sibling parent reply Cecil Ward <cecil cecilward.com> writes:
On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
 On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 [...]
I’m thinking it’s within the asm string, not one mentioned in the labels section. I’m trying to remember the syntax of local labels from the days of my youth, when I was a full-time pro asm programmer on various different machines.
Jul 07 2023
parent Cecil Ward <cecil cecilward.com> writes:
On Friday, 7 July 2023 at 21:58:20 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
 On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
 On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
 [...]
My kludge is to say that if a label is at the start of a line (bar any whitespace preceding it), then there must be at least one non-whitespace character before its colon. The section-delimiter colon that starts the first constraints block, by contrast, is the first non-whitespace character on its line. So I test whether a colon is the first thing seen on a line (apart from whitespace), and that distinguishes a label from the ‘global’ end-of-main-asm-section marker.

I also now check for ‘? :’ expressions in an extremely shoddy fashion, even handling nested ternary expressions by ‘?’-counting, so I can now see that a ‘:’ is not a main-section terminator when it belongs to a ‘? :’. So I’m knocking off the cases as you and I find them, in every case doing the minimum rather than using a full expression grammar.
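For anyone following along, the heuristic described above might be sketched roughly like this. It is not the actual code from my parser; `findSectionColon` and its details are assumptions made for illustration, and it only knows about double-quoted strings, `//` and `/* */` comments, ‘?’-counting, and the start-of-line rule.

```d
/// Returns the index of the first ':' that this heuristic treats as the
/// start of the constraints section, or -1 if none is found.
ptrdiff_t findSectionColon(string src)
{
    int ternaryDepth = 0;        // '?' seen and not yet paired with a ':'
    bool lineHasContent = false; // non-whitespace seen on the current line

    size_t i = 0;
    while (i < src.length)
    {
        immutable char c = src[i];

        if (c == '\n')
        {
            lineHasContent = false;
            ++i;
        }
        else if (c == ' ' || c == '\t' || c == '\r')
        {
            ++i;
        }
        else if (c == '"')
        {
            // Skip a double-quoted string, honouring backslash escapes.
            ++i;
            while (i < src.length && src[i] != '"')
                i += (src[i] == '\\') ? 2 : 1;
            ++i; // step over the closing quote
            lineHasContent = true;
        }
        else if (c == '/' && i + 1 < src.length && src[i + 1] == '/')
        {
            // Line comment: skip up to (not past) the newline.
            while (i < src.length && src[i] != '\n')
                ++i;
        }
        else if (c == '/' && i + 1 < src.length && src[i + 1] == '*')
        {
            // Block comment: skip past the closing "*/".
            i += 2;
            while (i + 1 < src.length && !(src[i] == '*' && src[i + 1] == '/'))
            {
                if (src[i] == '\n')
                    lineHasContent = false;
                ++i;
            }
            i += 2;
        }
        else if (c == ':')
        {
            if (ternaryDepth > 0)
                --ternaryDepth;           // pairs with an earlier '?'
            else if (!lineHasContent)
                return cast(ptrdiff_t) i; // colon opens the line: terminator
            // Otherwise text precedes the colon on this line: a label.
            lineHasContent = true;
            ++i;
        }
        else
        {
            if (c == '?')
                ++ternaryDepth;
            lineHasContent = true;
            ++i;
        }
    }
    return -1;
}
```

On Iain's example this stops at the `: output-constraints` line, because both ternary colons are absorbed by the ‘?’ count. The obvious limitation is that a terminator written on the same line as the instruction string, as in `"nop" : "=r" (x)`, would be mistaken for a label colon, so the constraints have to start on a fresh line.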
Jul 07 2023