digitalmars.D - Question about GCC / GDC / LDC syntax of inline asm advanced
- Cecil Ward (18/18) Jul 06 2023 In GDC and LDC’s inline asm syntax, the main asm part is
- Cecil Ward (3/10) Jul 06 2023 I would have put this question in the GDC or LDC sections, but it
- Iain Buclaw (17/28) Jul 07 2023 The first part is parsed as an
- Iain Buclaw (2/18) Jul 07 2023 As shown in compiler explorer: https://d.godbolt.org/z/eTqvW8o8E
- Cecil Ward (5/26) Jul 07 2023 Many thanks Iain. Much appreciated. I just had a go at checking
- Cecil Ward (7/38) Jul 07 2023 Many thanks Iain. I am not doing a full parse so I shall simply
- Cecil Ward (3/11) Jul 07 2023 Iain, what’s the lex syntax for a label in the asm ? is it always
- Iain Buclaw (9/21) Jul 07 2023 Where is the label being referenced/defined from?
- Iain Buclaw (6/24) Jul 07 2023 Saying that, I'm pretty sure GNU As accepts Unicode in symbol
- Cecil Ward (5/27) Jul 07 2023 I’m thinking it’s within the asm string. Not one mentioned in the
- Cecil Ward (14/45) Jul 07 2023 My kludge is to say that if a label is at the start of a line
In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon, which begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language? Is the parsing asm-dialect-specific, so that a full parse finds the first significant colon?

If not, and the very first colon (outside double-quoted strings and comments) ends the first section, which is how I parse it, then there is a problem, as labels contain colons. And so I have a bug in the grammar for my kludge asm section parser; see thread elsewhere.

About labels then: is a label the only place a non-section-terminator colon can occur? A label doesn’t have a colon before the name, but after it; is that correct for the AT&T asm dialect? If so, I could require a newline plus optional whitespace before any colon if it is to be recognised as a section terminator and the beginning of the constraints.
Jul 06 2023
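As an editorial illustration of the rule described in the opening post (the first colon outside double-quoted strings and comments ends the main section), here is a minimal D sketch. The function name `firstTopLevelColon` is hypothetical, and the sketch assumes only `"..."` strings with backslash escapes plus `//` and `/* */` comments; other D string forms, character literals and nesting `/+ +/` comments are not handled.

```d
/// Hypothetical helper: index of the first ':' that lies outside
/// double-quoted strings and comments, or -1 if there is none.
ptrdiff_t firstTopLevelColon(string src)
{
    size_t i = 0;
    while (i < src.length)
    {
        immutable c = src[i];
        if (c == '"')
        {
            // Skip a double-quoted string, stepping over backslash escapes.
            ++i;
            while (i < src.length && src[i] != '"')
                i += (src[i] == '\\') ? 2 : 1;
            ++i; // step past the closing quote
        }
        else if (c == '/' && i + 1 < src.length && src[i + 1] == '/')
        {
            // Line comment: skip to the end of the line.
            while (i < src.length && src[i] != '\n')
                ++i;
        }
        else if (c == '/' && i + 1 < src.length && src[i + 1] == '*')
        {
            // Block comment: skip to the closing "*/".
            i += 2;
            while (i + 1 < src.length && !(src[i] == '*' && src[i + 1] == '/'))
                ++i;
            i += 2;
        }
        else if (c == ':')
            return cast(ptrdiff_t) i;
        else
            ++i;
    }
    return -1;
}
```

In this sketch a label written inside the quoted instruction string is skipped along with the rest of the string, whereas the colon of a `? :` expression outside a string is still reported as the terminator.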
On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon, which begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language? Is the parsing asm-dialect-specific, so that a full parse finds the first significant colon? [...]

I would have put this question in the GDC or LDC sections, but it applies to both.
Jul 06 2023
On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon, which begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language? Is the parsing asm-dialect-specific, so that a full parse finds the first significant colon?

The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:

```
asm {
    (test ? "if-true-insn" : "if-false-insn")
    ~ buildAsmString(foo, bar)
    ~ test2() ? enumInsnTrue : enumInsnFalse
    // assign-expression finishes here
    : output-constraints
    : ...
}
```

It's only at semantic-time that a "string-literal" result is enforced using CTFE.

> If not, and the very first colon (outside double-quoted strings and comments) ends the first section, which is how I parse it, then there is a problem, as labels contain colons. And so I have a bug in the grammar for my kludge asm section parser; see thread elsewhere.

Labels are statements, so there shouldn't be any conflict between the two.
Jul 07 2023
On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> > [...]
>
> The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:
>
> ```
> asm {
>     (test ? "if-true-insn" : "if-false-insn")
>     ~ buildAsmString(foo, bar)
>     ~ test2() ? enumInsnTrue : enumInsnFalse
>     // assign-expression finishes here
>     : output-constraints
>     : ...
> }
> ```
>
> It's only at semantic-time that a "string-literal" result is enforced using CTFE.

As shown in compiler explorer: https://d.godbolt.org/z/eTqvW8o8E
Jul 07 2023
On Friday, 7 July 2023 at 12:12:15 UTC, Iain Buclaw wrote:
> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> > On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> > > [...]
> >
> > The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:
> >
> > ```
> > asm {
> >     (test ? "if-true-insn" : "if-false-insn")
> >     ~ buildAsmString(foo, bar)
> >     ~ test2() ? enumInsnTrue : enumInsnFalse
> >     // assign-expression finishes here
> >     : output-constraints
> >     : ...
> > }
> > ```
> >
> > It's only at semantic-time that a "string-literal" result is enforced using CTFE.
>
> As shown in compiler explorer: https://d.godbolt.org/z/eTqvW8o8E

Many thanks Iain. Much appreciated. I just had a go at checking for the ? : expression with a simple state machine. As I mentioned, I’m not doing a proper parse here, not by a million miles, just the bare minimum to get it to work.
Jul 07 2023
On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> > In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon, which begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language? Is the parsing asm-dialect-specific, so that a full parse finds the first significant colon?
>
> The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:
>
> ```
> asm {
>     (test ? "if-true-insn" : "if-false-insn")
>     ~ buildAsmString(foo, bar)
>     ~ test2() ? enumInsnTrue : enumInsnFalse
>     // assign-expression finishes here
>     : output-constraints
>     : ...
> }
> ```
>
> It's only at semantic-time that a "string-literal" result is enforced using CTFE.
>
> > If not, and the very first colon (outside double-quoted strings and comments) ends the first section, which is how I parse it, then there is a problem, as labels contain colons. And so I have a bug in the grammar for my kludge asm section parser; see thread elsewhere.
>
> Labels are statements, so there shouldn't be any conflict between the two.

Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
Jul 07 2023
On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> > [...]
>
> Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.

Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?
Jul 07 2023
On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
> On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
> > On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> > > [...]
> >
> > Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
>
> Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?

Where is the label being referenced/defined from?

Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html):

`[A-Za-z._][0-9A-Za-z._]+`

For a label referenced from an asm statement (goto labels section)? Then it's the same as any other D [Identifier](https://dlang.org/spec/lex.html#Identifier) - same as above, but also includes Unicode alpha characters.
Jul 07 2023
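As an editorial illustration of the label syntax quoted above, here is a small D sketch that recognises a label definition at the start of a line of assembler text. The names `isGasSymbolStart`, `isGasSymbolChar` and `leadingLabel` are hypothetical. Two assumptions go beyond the quoted pattern: one-character names are accepted (so the tail is effectively `*` rather than `+`), and all-digit names are accepted because GNU As also has numeric local labels such as `1:`. Unicode and `$` are left out.

```d
import std.ascii : isAlpha, isAlphaNum, isDigit;
import std.string : stripLeft;

/// First character of a symbol name, per the pattern quoted in the thread.
bool isGasSymbolStart(char c) { return isAlpha(c) || c == '.' || c == '_'; }

/// Any later character of a symbol name, per the same pattern.
bool isGasSymbolChar(char c) { return isAlphaNum(c) || c == '.' || c == '_'; }

/// If `line` begins (after optional whitespace) with `name:`, returns the
/// name; otherwise returns null. ASCII only.
string leadingLabel(string line)
{
    auto s = line.stripLeft;
    if (s.length == 0)
        return null;

    size_t n = 0;
    if (isDigit(s[0]))
    {
        // Assumption: numeric local label, digits only.
        while (n < s.length && isDigit(s[n]))
            ++n;
    }
    else if (isGasSymbolStart(s[0]))
    {
        n = 1;
        while (n < s.length && isGasSymbolChar(s[n]))
            ++n;
    }
    else
        return null;

    // A label definition needs the colon immediately after the name.
    return (n < s.length && s[n] == ':') ? s[0 .. n] : null;
}
```

For example, `leadingLabel("  loop_top: dec %0")` yields `"loop_top"`, while a line that starts with a bare colon yields `null`, since there is no name in front of it.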
On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
> On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
> > On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
> > > On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> > > > [...]
> > >
> > > Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
> >
> > Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?
>
> Where is the label being referenced/defined from?
>
> Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html):
>
> `[A-Za-z._][0-9A-Za-z._]+`

Saying that, I'm pretty sure GNU As accepts Unicode in symbol names, as I have encountered reports of testsuite failures on Solaris 10/11 that involved Oracle's assembler and Unicode in D function names (gcc and gccgo instead encode the Unicode characters in a symbol name, IIRC).
Jul 07 2023
On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
> On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
> > On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
> > > On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> > > > [...]
> > >
> > > Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
> >
> > Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?
>
> Where is the label being referenced/defined from?
>
> Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html):
>
> `[A-Za-z._][0-9A-Za-z._]+`
>
> For a label referenced from an asm statement (goto labels section)? Then it's the same as any other D [Identifier](https://dlang.org/spec/lex.html#Identifier) - same as above, but also includes Unicode alpha characters.

I’m thinking it’s within the asm string, not one mentioned in the labels section. I’m trying to remember the syntax of local labels from the days of my youth, when I was a full-time pro asm programmer on various different machines.
Jul 07 2023
On Friday, 7 July 2023 at 21:58:20 UTC, Cecil Ward wrote:
> On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
> > On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
> > > On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
> > > > On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> > > > > [...]
> > > >
> > > > Many thanks Iain. I am not doing a full parse, so I shall simply have to warn the user that the ternary operator must be resolved at CTFE before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
> > >
> > > Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?
> >
> > Where is the label being referenced/defined from?
> >
> > Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html):
> >
> > `[A-Za-z._][0-9A-Za-z._]+`
> >
> > For a label referenced from an asm statement (goto labels section)? Then it's the same as any other D [Identifier](https://dlang.org/spec/lex.html#Identifier) - same as above, but also includes Unicode alpha characters.
>
> I’m thinking it’s within the asm string, not one mentioned in the labels section. I’m trying to remember the syntax of local labels from the days of my youth, when I was a full-time pro asm programmer on various different machines.

My kludge is to say that if a label is at the start of a line (bar any whitespace preceding it) then there must be at least one non-whitespace character before the colon. The section delimiter colon that starts the first constraints block is at the start of its line, so I test whether or not (apart from whitespace) a colon is seen at the start of a line, and that then distinguishes between a label and a ‘global’ end-of-main-asm-section marker.

I also now check for "? :" expressions in an extremely shoddy fashion, even handling nested ternary expressions by ‘?’-counting, and I can now see that a ‘:’ is not a main section terminator when it is in a ‘? :’. So I’m knocking off the cases as you and I find them. In every case, doing the minimum, not with a full expression grammar.
Jul 07 2023
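As an editorial illustration, the heuristic described in the last post could be sketched in D roughly as follows. This is not the poster's code: the function name `findSectionColon` is hypothetical, and the sketch assumes plain `"..."` strings with backslash escapes, no comments, and a section colon that is the first non-whitespace character on its line. A colon counts as the terminator only if it starts its line and no `?` is still waiting for its `:`.

```d
/// Hypothetical helper: index of the colon that ends the main asm
/// section under the line-start heuristic, or -1 if none is found.
ptrdiff_t findSectionColon(string src)
{
    int pendingTernaries = 0;    // '?' seen but not yet matched by a ':'
    bool onlySpaceSoFar = true;  // nothing but whitespace on this line yet

    for (size_t i = 0; i < src.length; ++i)
    {
        immutable c = src[i];
        if (c == '\n')
        {
            onlySpaceSoFar = true;
            continue;
        }
        if (c == ' ' || c == '\t' || c == '\r')
            continue;

        if (c == '"')
        {
            // Skip a double-quoted string; the loop's ++i then steps
            // past the closing quote.
            ++i;
            while (i < src.length && src[i] != '"')
                i += (src[i] == '\\') ? 2 : 1;
        }
        else if (c == '?')
            ++pendingTernaries;
        else if (c == ':')
        {
            if (pendingTernaries > 0)
                --pendingTernaries;        // the ':' of a '? :' pair
            else if (onlySpaceSoFar)
                return cast(ptrdiff_t) i;  // line-start colon: terminator
            // Otherwise the colon follows other text: treated as a label.
        }
        onlySpaceSoFar = false;
    }
    return -1;
}
```

Counting `?` characters copes with nested ternaries and with a ternary whose `:` begins a line, but it is not a real expression parse, so this sketch would, for instance, miss a constraints colon written on the same line as the instruction string.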