digitalmars.D - popcnt usage

"Todd VanderVeen" <tdvanderveen gmail.com> writes:
First, let me say thanks for the addition of the popcnt inline 
assembler opcode. I had placed a project on hold until it was 
available. I look forward to using D again.

I determined by experimentation that this instruction is 
available, as it's not documented on the inline assembler page.

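uint popcnt (ulong bits) {
    asm {
       mov RAX, bits ;      // load the argument
       popcnt RAX, RAX ;    // count the set bits
    }
    // no return statement: the count is left in RAX (EAX) as the result
}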

The documentation mentions SSE4.2 support, but I understand 
popcnt and lzcnt aren't really considered part of that 
instruction set, as they operate on general-purpose registers 
rather than the SSE registers. If I were to submit a pull request 
to address the documentation, how would you prefer this to be 
represented: simply as additions to the opcode table, or 
annotated as having been implemented alongside SSE4.2? Both?

A second concern is whether it is possible to determine the 
availability of this instruction at compile time. I want to do 
something like the following in a custom popcnt method:

version(X86_64) {
    static if (hasPopcnt()) {
       asm {
          ... performant assembly version
       }
    } else {
       ... slower procedural version
    }
}

But the miscellaneous features of core.cpuid are not available 
for conditional compilation. Is there an undocumented version 
label that could be used to this end? Is my only option to pass a 
version flag on the command line?

version(X86_64) {
    version(Has_Popcnt) {
       asm {
          ... performant assembly version
       }
    }
    else {
       ... slower procedural version
    }
}

This is workable, but it would be nice if these finer 
architectural distinctions were available for conditional 
compilation without the need for the extra external configuration.
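
A minimal sketch of the command-line approach, assuming the label is supplied with -version=Has_Popcnt when invoking dmd. Has_Popcnt, UsePopcntAsm, popcntSoft and countBits are illustrative names, not part of any library:

```d
// Hypothetical labels: Has_Popcnt comes from the command line,
// UsePopcntAsm is derived from it here.
version (Has_Popcnt)
{
    version (D_InlineAsm_X86_64)
        version = UsePopcntAsm;
}

/// Software fallback: clears the lowest set bit until none remain.
uint popcntSoft(ulong bits) pure nothrow @nogc @safe
{
    uint count = 0;
    while (bits != 0)
    {
        bits &= bits - 1; // drop the lowest set bit
        ++count;
    }
    return count;
}

/// Counts set bits; the code path is fixed at compile time.
uint countBits(ulong bits) nothrow @nogc
{
    version (UsePopcntAsm)
    {
        uint result;
        asm nothrow @nogc
        {
            mov RAX, bits;
            popcnt RAX, RAX;
            mov result, EAX;
        }
        return result;
    }
    else
    {
        return popcntSoft(bits);
    }
}
```

Built without the flag, countBits always takes the fallback; built with it, the binary assumes the target CPU supports popcnt and performs no check.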
Dec 23 2013
"Todd VanderVeen" <tdvanderveen gmail.com> writes:
I retract my second concern. I mistook a purity error for a CTFE 
error. This does work as expected.

import core.cpuid: hasPopcnt;

/// Returns the number of bits which are set.
uint popcnt(ulong bits) nothrow
{
    version(X86_64) {
       if(hasPopcnt()) {
          asm {
             ....
          }
       }
       else {
          ...
       }
    }
}

Is there any reason that core.cpuid.hasPopcnt() cannot be made 
pure? Hopefully, calling it won't change my processor :)
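
For reference, a filled-in sketch of the runtime check, under the assumption that a simple bit-clearing loop is an acceptable fallback. popcntSoft and countBits are illustrative names; hasPopcnt is the core.cpuid property used above:

```d
/// Software fallback: clears the lowest set bit until none remain.
uint popcntSoft(ulong bits) pure nothrow @nogc @safe
{
    uint count = 0;
    while (bits != 0)
    {
        bits &= bits - 1; // drop the lowest set bit
        ++count;
    }
    return count;
}

/// Counts set bits, choosing the path at run time.
uint countBits(ulong bits) nothrow
{
    version (D_InlineAsm_X86_64)
    {
        import core.cpuid : hasPopcnt;

        if (hasPopcnt) // CPU feature flag queried at run time
        {
            uint result;
            asm nothrow
            {
                mov RAX, bits;
                popcnt RAX, RAX;
                mov result, EAX;
            }
            return result;
        }
    }
    // Reached on other targets or when the CPU lacks popcnt.
    return popcntSoft(bits);
}
```

The branch costs one test per call, but the same binary runs on CPUs with and without the instruction.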
Dec 23 2013
"Todd VanderVeen" <tdvanderveen gmail.com> writes:
Actually, I dropped the static if in my example and traded one 
problem for another.

As the static variables of core.cpuid are not accessible at 
compile time, I would like to emulate the CPU interrogation done 
there, but I see that asm statements are disallowed in CTFE. Is 
versioning the only option here?
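
One way to keep such a function usable in CTFE, sketched here as an assumption rather than an answer to the detection question: guard the asm path with the built-in __ctfe variable so compile-time evaluation only ever reaches the portable fallback. This does not detect the CPU at compile time. popcntSoft and countBits are illustrative names:

```d
/// Software fallback, usable at compile time and at run time.
uint popcntSoft(ulong bits) pure nothrow @nogc @safe
{
    uint count = 0;
    while (bits != 0)
    {
        bits &= bits - 1; // drop the lowest set bit
        ++count;
    }
    return count;
}

/// Counts set bits; skips the asm path during CTFE.
uint countBits(ulong bits) nothrow
{
    if (!__ctfe) // true only while the compiler is interpreting the call
    {
        version (D_InlineAsm_X86_64)
        {
            import core.cpuid : hasPopcnt;

            if (hasPopcnt)
            {
                uint result;
                asm nothrow
                {
                    mov RAX, bits;
                    popcnt RAX, RAX;
                    mov result, EAX;
                }
                return result;
            }
        }
    }
    return popcntSoft(bits);
}

// Evaluated by the compiler, so only the fallback runs here.
enum bitsAtCompileTime = countBits(0xF0F0);
static assert(bitsAtCompileTime == 8);
```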
Dec 23 2013
Iain Buclaw <ibuclaw gdcproject.org> writes:
On 23 December 2013 16:47, Todd VanderVeen <tdvanderveen gmail.com> wrote:
 A second concern is whether it is possible to determine the availability of
 this instruction at compile time. I want to do something like the following
 in a custom popcnt method:

 version(X86_64) {
    static if (hasPopcnt()) {
       asm {
          ... performant assembly version
       }
    } else {
       ... slower procedural version
    }
 }

There's no way to do this at compile time, other than to assume that D_InlineAsm_X86_64 implies popcnt, or to do a runtime check to determine the correct path to take.
Dec 23 2013
Iain Buclaw <ibuclaw gdcproject.org> writes:
On 23 December 2013 20:18, Todd VanderVeen <tdvanderveen gmail.com> wrote:
 Is there any reason that core.cpuid.hasPopcnt() cannot be made pure?
 Hopefully, calling it won't change my processor :)

It may have side effects, or no one thought about making it pure.
Dec 23 2013
Marco Leise <Marco.Leise gmx.de> writes:
On Mon, 23 Dec 2013 23:46:11 +0000,
Iain Buclaw <ibuclaw gdcproject.org> wrote:

 There's no way to do this at compile time, other than to assume that
 D_InlineAsm_X86_64 implies popcnt, or to do a runtime check to determine
 the correct path to take.

You _could_ export the target CPU as some built-in enum, like in the old days when it resulted in Pentium Pro and K6 builds.

-- Marco
Dec 23 2013
Marco Leise <Marco.Leise gmx.de> writes:
On Mon, 23 Dec 2013 23:46:11 +0000,
Iain Buclaw <ibuclaw gdcproject.org> wrote:

 There's no way to do this at compile time, other than to assume that
 D_InlineAsm_X86_64 implies popcnt, or to do a runtime check to determine
 the correct path to take.

Oh, and if I remember correctly, the popcnt intrinsic in GDC is somewhat slow in emulation mode. No biggie, I was just reminded of it.

-- Marco
Dec 23 2013
"Kai Nacke" <kai redstar.de> writes:
On Monday, 23 December 2013 at 16:47:32 UTC, Todd VanderVeen 
wrote:
 A second concern is whether it is possible to determine the 
 availability of this instruction at compile time. I want to do 
 something like the following in a custom popcnt method:

 version(X86_64) {
    static if (hasPopcnt()) {
       asm {
          ... performant assembly version
       }
    } else {
       ... slower procedural version
    }
 }

 But the miscellaneous features of core.cpuid are not available 
 for conditional compilation. Is there an undocumented version 
 label that could be used to this end? Is my only option to pass 
 a version flag on the command line?

 version(X86_64) {
    version(Has_Popcnt) {
       asm {
          ... performant assembly version
       }
    }
    else {
       ... slower procedural version
    }
 }

 This is workable, but it would be nice if these finer 
 architectural distinctions were available for conditional 
 compilation without the need for the extra external 
 configuration.

With ldc2 you can pass -mattr=+popcnt to use the popcnt instruction and -mattr=-popcnt to use the emulation.

Regards,
Kai
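
A sketch of code that leaves the choice to those flags, assuming a druntime recent enough that core.bitop.popcnt accepts a ulong. Whether the call becomes a single instruction depends on the compiler and the target options given. countBits is an illustrative name:

```d
import core.bitop : popcnt;

/// Counts set bits without inline asm; the compiler picks the
/// instruction or the emulation according to its target options.
uint countBits(ulong bits) pure nothrow @nogc @safe
{
    return cast(uint) popcnt(bits);
}
```

With no inline asm involved, this version can also be marked pure.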
Jan 02 2014