/proc/cpuinfo vs. processor groups

/proc/cpuinfo vs. processor groups

Achim Gratz

As briefly discussed on IRC, I've got a new Server 2016 blade with 2
sockets × 8 cores × 2 HT = 32 logical processors, and Cygwin spews
errors for processor ID 16 and up (top doesn't quite work either, which
likely has the same cause, although its code path may be unrelated to
the /proc/cpuinfo bug described here).

--8<---------------cut here---------------start------------->8---
64bit (166)~ > cat /proc/cpuinfo
      0 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(10000,0 (10/16)) failed Win32 error 87
    209 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(20000,0 (11/17)) failed Win32 error 87
    913 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(40000,0 (12/18)) failed Win32 error 87
   1047 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(80000,0 (13/19)) failed Win32 error 87
   1151 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(100000,0 (14/20)) failed Win32 error 87
   1266 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(200000,0 (15/21)) failed Win32 error 87
   1383 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(400000,0 (16/22)) failed Win32 error 87
   1479 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(800000,0 (17/23)) failed Win32 error 87
   1573 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(1000000,0 (18/24)) failed Win32 error 87
   1675 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(2000000,0 (19/25)) failed Win32 error 87
   1806 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(4000000,0 (1A/26)) failed Win32 error 87
   1888 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(8000000,0 (1B/27)) failed Win32 error 87
   1971 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(10000000,0 (1C/28)) failed Win32 error 87
   2069 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(20000000,0 (1D/29)) failed Win32 error 87
   2154 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(40000000,0 (1E/30)) failed Win32 error 87
   2247 [main] cat 10068 format_proc_cpuinfo: SetThreadGroupAffinity(80000000,0 (1F/31)) failed Win32 error 87
--8<---------------cut here---------------end--------------->8---

It turns out this is related to processor groups and some changes that
probably weren't even in the making when the Cygwin code was written.
These changes were opt-in patches until 2008R2, but are now the default
in 2016:

https://blogs.msdn.microsoft.com/saponsqlserver/2011/10/08/uneven-windows-processor-groups/

The BIOS on that server does something rather peculiar (it does make
sense in a way, but Cygwin clearly didn't expect it):

https://support.hpe.com/hpsc/doc/public/display?sp4ts.oid=7271227&docId=emr_na-c04650594&docLocale=en_US

This results in Windows coming up with two 64-processor groups that
have only 16 active logical processors each:

--8<---------------cut here---------------start------------->8---
(gdb) print plpi
$1 = (PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX) 0x600008020
(gdb) print *plpi
$2 = {Relationship = RelationGroup, Size = 128, {Processor = {Flags = 2 '\002', Reserved = "\000\002", '\000' <repeats 18 times>, GroupCount = 0,
      GroupMask = {{Mask = 4160, Group = 0, Reserved = {0, 0, 0}}}}, NumaNode = {NodeNumber = 131074, Reserved = '\000' <repeats 19 times>,
      GroupMask = {Mask = 4160, Group = 0, Reserved = {0, 0, 0}}}, Cache = {Level = 2 '\002', Associativity = 0 '\000', LineSize = 2, CacheSize = 0,
      Type = CacheUnified, Reserved = '\000' <repeats 12 times>, "@\020\000\000\000\000\000", GroupMask = {Mask = 0, Group = 0, Reserved = {0, 0,
          0}}}, Group = {MaximumGroupCount = 2, ActiveGroupCount = 2, Reserved = '\000' <repeats 19 times>, GroupInfo = {{
          MaximumProcessorCount = 64 '@', ActiveProcessorCount = 16 '\020', Reserved = '\000' <repeats 37 times>, ActiveProcessorMask = 65535}}}}}
--8<---------------cut here---------------end--------------->8---
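Given that layout, the failures above fall out directly: Win32 error 87 is
ERROR_INVALID_PARAMETER, and each failing call passes group 0 with a mask bit
at position 16 or higher, while group 0 only has 16 active processors.  A
minimal sketch of the validity rule I assume Windows applies (struct and
function names are mine, not Cygwin's or the Win32 API's):

```c
#include <stdint.h>

/* Hypothetical sketch: a (mask, group) pair is presumably only accepted
 * if the group exists and the mask is a nonzero subset of that group's
 * active processor mask.  With the layout above (two groups,
 * ActiveProcessorMask 0xFFFF each), the call for logical CPU 16 passes
 * mask 0x10000 in group 0, which violates this rule -- consistent with
 * the Win32 error 87 (ERROR_INVALID_PARAMETER) seen in the log. */
typedef struct {
    uint64_t active_mask;       /* per-group ActiveProcessorMask */
} group_info;

int affinity_is_valid(uint64_t mask, uint16_t group,
                      const group_info *groups, uint16_t group_count)
{
    if (group >= group_count || mask == 0)
        return 0;
    /* every set bit of mask must lie inside the group's active mask */
    return (mask & ~groups[group].active_mask) == 0;
}
```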

I've confirmed that the error message is not printed if I manually
correct the information for processor ID 17 as follows:

--8<---------------cut here---------------start------------->8---
(gdb) print affinity
$2 = {Mask = 131072, Group = 0, Reserved = {0, 0, 0}}
(gdb) set affinity.Mask=2
(gdb) set affinity.Group=1
--8<---------------cut here---------------end--------------->8---
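The manual fix above can be derived mechanically: assuming each group's
active processors fill the low bits of its mask contiguously, walking the
per-group ActiveProcessorCount turns a flat processor ID into a
(Group, Mask) pair.  A sketch of that mapping (a hypothetical helper, not
what Cygwin currently does):

```c
#include <stdint.h>

/* Map a flat logical processor ID to a (group, mask) pair, assuming the
 * active processors of each group occupy the low ActiveProcessorCount
 * bits of its mask.  For two groups with 16 active processors each,
 * CPU 17 yields group 1, mask 1 << 1 == 2 -- exactly the values set by
 * hand in gdb above. */
int cpu_to_group_affinity(unsigned cpu, const uint8_t *active_count,
                          uint16_t group_count,
                          uint16_t *group, uint64_t *mask)
{
    for (uint16_t g = 0; g < group_count; g++) {
        if (cpu < active_count[g]) {
            *group = g;
            *mask = 1ULL << cpu;
            return 0;
        }
        cpu -= active_count[g];     /* skip this group's processors */
    }
    return -1;                      /* no such processor */
}
```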

However, the same or possibly even stranger processor group setups can
be created using boot options that force different organizations of
processor groups.  There is an option to force a separate processor
group for each NUMA node and another one to force a specific number of
groups.  The upshot is that even the first processor groups may not have
the maximum number of processors present, so you need to check the
number of active processors instead.  I couldn't find out whether the
processor mask is still guaranteed to be filled contiguously from the
LSB, or whether one can rely on only the last group having fewer active
processors than the earlier ones.  It seems more prudent to check the
group-specific ActiveProcessorMask, although that significantly
complicates the code.  I don't think Windows can currently switch CPUs
online/offline once booted.
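If contiguous filling can't be relied on, the same mapping can instead walk
the set bits of each group's ActiveProcessorMask, at the cost of the extra
inner loop.  A sketch under that assumption (again a made-up helper name):

```c
#include <stdint.h>

/* Variant that does not assume contiguous masks: the flat processor ID
 * selects the n-th set bit across the per-group ActiveProcessorMask
 * values, so holes in a mask are skipped correctly. */
int cpu_to_group_affinity_masked(unsigned cpu, const uint64_t *active_mask,
                                 uint16_t group_count,
                                 uint16_t *group, uint64_t *mask)
{
    for (uint16_t g = 0; g < group_count; g++) {
        uint64_t m = active_mask[g];
        while (m) {
            uint64_t bit = m & -m;  /* isolate lowest set bit */
            if (cpu == 0) {
                *group = g;
                *mask = bit;
                return 0;
            }
            cpu--;
            m &= m - 1;             /* clear lowest set bit */
        }
    }
    return -1;                      /* no such processor */
}
```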


As an aside, the cache size is reported as 256 KiB (not just for this
processor, but also for a Celeron 1037U on another machine), which seems
to be the L2 cache of a single hardware core on these architectures.
Linux now reports L3 cache sizes (and possibly L4 if present) for these
(20 MiB and 2 MiB per socket, respectively).


Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

Re: /proc/cpuinfo vs. processor groups

Corinna Vinschen-2
Thanks for the report, but this belongs on the cygwin ML.
I'm redirecting it here.


Corinna

On Apr 10 18:36, Achim Gratz wrote:

> [full quote of the original message snipped]
--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
