Optimising cygwin fork performance

Optimising cygwin fork performance

cygwin-apps mailing list
Hi,

So I know it's been mentioned a lot that fork is slow on Cygwin, but
compared to other people's machines, eg when building, it seems way
slower for me.

First I'd like to know if there's a good way to measure this that anyone
has found, because I'm not sure how to measure it. If I print multiple
lines with echo in a script, I can see it printing maybe 2-3 a second -
it's very slow.

I think this might be because I'm using a Virtual Machine with
VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivirus,
with exceptions for the Cygwin install folders (C:\cygwin64, C:\cygwin).

It might be nice if we could do some comparisons so I can figure out
what's wrong.

Hamish



Re: Optimising cygwin fork performance

cygwin-apps mailing list


On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:

> Hi,
>
> So I know it's been mentioned a lot that fork is slow on Cygwin, but
> compared to other people's machines, eg when building, it seems way
> slower for me.
>
> First I'd like to know if there's a good way to measure this that anyone
> has found, because I'm not sure how to measure it. If I print multiple
> lines with echo in a script, I can see it printing maybe 2-3 a second -
> it's very slow.
>
> I think this might be because I'm using a Virtual Machine with
> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivirus,
> with exceptions for the Cygwin install folders (C:\cygwin64, C:\cygwin).
>
> It might be nice if we could do some comparisons so I can figure out
> what's wrong.
>
> Hamish

Same AV here, W10 64bit (no VM), 2 year old Laptop

model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
4 cores

https://github.com/mondalaci/fork-benchmark
It seems there is an aging effect:

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 39.928576 seconds.

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 42.701295 seconds.

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 49.890909 seconds.

$ ./fork-benchmark.exe 1000
Forked, executed and destroyed 1000 processes in 61.657031 seconds.
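
To put a number on the slowdown across the runs above, the benchmark output can be piped through a small filter. This is only a sketch: summarize_runs is a hypothetical helper name (not part of fork-benchmark), and it assumes the output format shown above stays the same.

```shell
#!/bin/bash
# summarize_runs is a hypothetical helper, not part of fork-benchmark.
# It reads fork-benchmark output lines on stdin and reports how much
# slower the last run was than the first.
summarize_runs() {
    awk '/processes in [0-9.]+ seconds/ {
             t = $(NF - 1)              # elapsed seconds: second-to-last field
             if (n == 0) first = t
             last = t
             n++
         }
         END {
             if (n > 0)
                 printf "runs=%d first=%.2fs last=%.2fs ratio=%.2f\n",
                        n, first, last, last / first
         }'
}

# Demo with the four results quoted above; for a live measurement use
#   for i in 1 2 3 4; do ./fork-benchmark.exe 1000; done | summarize_runs
summarize_runs <<'EOF'
Forked, executed and destroyed 1000 processes in 39.928576 seconds.
Forked, executed and destroyed 1000 processes in 42.701295 seconds.
Forked, executed and destroyed 1000 processes in 49.890909 seconds.
Forked, executed and destroyed 1000 processes in 61.657031 seconds.
EOF
# prints: runs=4 first=39.93s last=61.66s ratio=1.54
```

A ratio well above 1 over consecutive runs would point to the aging effect; a ratio near 1 would suggest the runs are stable.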



Re: Optimising cygwin fork performance

Christian Franke
Marco Atzeri via Cygwin-apps wrote:

>
> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
>> Hi,
>>
>> So I know it's been mentioned a lot that fork is slow on Cygwin, but
>> compared to other people's machines, eg when building, it seems way
>> slower for me.
>>
>> First I'd like to know if there's a good way to measure this that anyone
>> has found, because I'm not sure how to measure it. If I print multiple
>> lines with echo in a script, I can see it printing maybe 2-3 a second -
>> it's very slow.
>>
>> I think this might be because I'm using a Virtual Machine with
>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivirus,
>> with exceptions for the Cygwin install folders (C:\cygwin64, C:\cygwin).
>>
>> It might be nice if we could do some comparisons so I can figure out
>> what's wrong.
>>
>> Hamish
>
> Same AV here, W10 64bit (no VM), 2 year old Laptop
>
> model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
> 4 cores
>
> https://github.com/mondalaci/fork-benchmark
> It seems there is an aging effect:

Possibly a result of power management? It might be prevented by running
another minor CPU load in parallel.
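
One way to try the parallel-load idea is to wrap the benchmark in a helper that keeps a low-priority busy loop running for the duration of the command. This is an untested sketch: with_background_load is a hypothetical name, and whether a single niced busy loop is enough to stop the CPU from clocking down is an assumption, not something measured in this thread.

```shell
#!/bin/bash
# with_background_load is a hypothetical helper: it runs the given command
# while a lowest-priority busy loop keeps one core occupied, then stops
# the loop and returns the command's exit status.
with_background_load() {
    nice -n 19 sh -c 'while :; do :; done' &
    local load_pid=$!
    "$@"
    local status=$?
    kill "$load_pid" 2>/dev/null
    wait "$load_pid" 2>/dev/null    # reap the loop, hide the "Terminated" notice
    return "$status"
}

# Intended use (assumes the benchmark binary is in the current directory):
#   with_background_load ./fork-benchmark.exe 1000
with_background_load echo "background load started and stopped"
```

If repeated runs stay flat with the load running but degrade without it, power management becomes the more likely explanation.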


>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 39.928576 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 42.701295 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 49.890909 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 61.657031 seconds.
>
>


9 year old PC, W10 64bit, no VM, Intel i7-2600K CPU @ 3.4GHz, 4 cores /
8 threads

AV: Windows Defender
Cygwin x86_64

Protection on:

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 16.651101 seconds.

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 16.674107 seconds.

Protection off:

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 14.281071 seconds.

$ ./fork-benchmark-64 1000
Forked, executed and destroyed 1000 processes in 14.426482 seconds.


An alternative which could be run out-of-the-box is the good old
'date(s) per second' benchmark. Its results are comparable:

$ while true; do date; done | uniq -c
...
      53 Wed Dec 16 19:23:26 CET 2020   <== Protection on
      54 Wed Dec 16 19:23:27 CET 2020
      56 Wed Dec 16 19:23:28 CET 2020
      55 Wed Dec 16 19:23:29 CET 2020
      56 Wed Dec 16 19:23:30 CET 2020
...
      52 Wed Dec 16 19:23:37 CET 2020
      54 Wed Dec 16 19:23:38 CET 2020
      56 Wed Dec 16 19:23:39 CET 2020
      63 Wed Dec 16 19:23:40 CET 2020   <== Protection off
      63 Wed Dec 16 19:23:41 CET 2020
      62 Wed Dec 16 19:23:42 CET 2020
      64 Wed Dec 16 19:23:43 CET 2020
...
      63 Wed Dec 16 19:23:51 CET 2020
      64 Wed Dec 16 19:23:52 CET 2020
      63 Wed Dec 16 19:23:53 CET 2020
      55 Wed Dec 16 19:23:54 CET 2020   <== Protection on
      48 Wed Dec 16 19:23:55 CET 2020
      53 Wed Dec 16 19:23:56 CET 2020
      54 Wed Dec 16 19:23:57 CET 2020
      54 Wed Dec 16 19:23:58 CET 2020
...
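
A bounded variant of the same idea prints a single number instead of running until interrupted. This is a rough sketch: forks_per_second is a hypothetical helper name, and because bash's SECONDS counter has one-second resolution, the measurement window can be up to a second short, so the result is approximate.

```shell
#!/bin/bash
# forks_per_second is a hypothetical helper: it spawns the external
# 'date' command in a loop for roughly <duration> seconds and prints
# the average number of spawns per second.
forks_per_second() {
    local duration=${1:-5}              # measurement window in seconds
    local count=0
    local end=$(( SECONDS + duration ))
    while (( SECONDS < end )); do
        date > /dev/null                # one fork + exec + wait per iteration
        count=$(( count + 1 ))
    done
    echo $(( count / duration ))
}

forks_per_second 2
```

Longer windows reduce the rounding error from the one-second resolution, so a value such as 10 gives a steadier figure than 2.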


Cygwin x86 is somewhat slower:

Protection on:

$ ./fork-benchmark-32.exe 1000
Forked, executed and destroyed 1000 processes in 19.231766 seconds.

Protection off:

$ ./fork-benchmark-32.exe 1000
Forked, executed and destroyed 1000 processes in 17.107739 seconds.

Regards,
Christian


Re: Optimising cygwin fork performance

Brian Inglis
In reply to this post by cygwin-apps mailing list
On 2020-12-16 10:36, Marco Atzeri via Cygwin-apps wrote:

> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
>> So I know it's been mentioned a lot that fork is slow on Cygwin, but
>> compared to other people's machines, eg when building, it seems way
>> slower for me.
>>
>> First I'd like to know if there's a good way to measure this that anyone
>> has found, because I'm not sure how to measure it. If I print multiple
>> lines with echo in a script, I can see it printing maybe 2-3 a second -
>> it's very slow.
>>
>> I think this might be because I'm using a Virtual Machine with
>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivirus,
>> with exceptions for the Cygwin install folders (C:\cygwin64, C:\cygwin).
>>
>> It might be nice if we could do some comparisons so I can figure out
>> what's wrong.

Running strace on your forking executable should give you accurate numbers in
the output, with some work to extract the relevant values.

> Same AV here, W10 64bit (no VM), 2 year old Laptop
>
> model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
> 4 cores
>
> https://github.com/mondalaci/fork-benchmark
> It seems there is an aging effect:
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 39.928576 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 42.701295 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 49.890909 seconds.
>
> $ ./fork-benchmark.exe 1000
> Forked, executed and destroyed 1000 processes in 61.657031 seconds.

It's all part of your current process tree.
Running cygserver may help this with appropriate process settings:

$ grep -C1 'kern\.srv\..*_.*' /etc/cygserver.conf

# kern.srv.cleanup_threads: No. of cygserver threads used for cleanup tasks.
# Default: 2, Min: 1, Max: 16, command line option -c, --cleanup-threads
#kern.srv.cleanup_threads 2
kern.srv.cleanup_threads 16

# kern.srv.request_threads: No. of cygserver threads used to serve
#                           application requests.
# Default: 10, Min: 1, Max: 310, command line option -r, --request-threads
#kern.srv.request_threads 10
kern.srv.request_threads 310

# kern.srv.process_cache_size: No. of concurrent processes which can be handled
#                              by Cygserver concurrently.
# Default: 62, Min: 1, Max: 310, command line option -p, --process-cache
#kern.srv.process_cache_size 62
kern.srv.process_cache_size 310

$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 33.397727 seconds.
$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 34.70389 seconds.
$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 34.186709 seconds.
$ ./fork-benchmark 1000
Forked, executed and destroyed 1000 processes in 33.65649 seconds.
$ sed -En '/^model name|^cpu MHz/p;/MHz/q' /proc/cpuinfo
model name      : AMD A10-9700 RADEON R7, 10 COMPUTE CORES 4C+6G
cpu MHz         : 3500.000

$ strace -o fork-benchmark.strace ./fork-benchmark 10
$ egrep '^--- Process [0-9]+ (crea|\(pid: [0-9]+\) exi)ted|dwProcessId [0-9]+|ExitProcess n 0x' fork-benchmark.strace > fork-benchmark.log
$ awk '/ [0-9]+! pinfo::exit: /{t+=$2};END{print t/10000"ms"}' fork-benchmark.log
34.1855ms
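
As a cross-check on the strace-derived figure, the benchmark totals above can be converted to an average cost per process. This is a sketch only: per_process_ms is a hypothetical helper name, and it assumes the fork-benchmark output format shown in this thread.

```shell
#!/bin/bash
# per_process_ms is a hypothetical helper: it converts each fork-benchmark
# result line on stdin into an average cost per process in milliseconds.
per_process_ms() {
    awk '/processes in [0-9.]+ seconds/ {
             count = $(NF - 4)          # number of processes
             secs  = $(NF - 1)          # total elapsed seconds
             printf "%.2f ms\n", secs * 1000 / count
         }'
}

# Demo with the four runs shown above.
per_process_ms <<'EOF'
Forked, executed and destroyed 1000 processes in 33.397727 seconds.
Forked, executed and destroyed 1000 processes in 34.70389 seconds.
Forked, executed and destroyed 1000 processes in 34.186709 seconds.
Forked, executed and destroyed 1000 processes in 33.65649 seconds.
EOF
# prints 33.40 ms, 34.70 ms, 34.19 ms and 33.66 ms, one per line
```

Both methods land at roughly 34 ms per process on this machine.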

Faster CPUs, faster memory, bigger caches, or an SSD may help.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]


Re: Optimising cygwin fork performance

cygwin-apps mailing list
On 16/12/2020 20:37, Brian Inglis wrote:

> On 2020-12-16 10:36, Marco Atzeri via Cygwin-apps wrote:
>> On 16.12.2020 13:13, Hamish McIntyre-Bhatty via Cygwin-apps wrote:
>>> So I know it's been mentioned a lot that fork is slow on Cygwin, but
>>> compared to other people's machines, eg when building, it seems way
>>> slower for me.
>>>
>>> First I'd like to know if there's a good way to measure this that
>>> anyone
>>> has found, because I'm not sure how to measure it. If I print multiple
>>> lines with echo in a script, I can see it printing maybe 2-3 a second -
>>> it's very slow.
>>>
>>> I think this might be because I'm using a Virtual Machine with
>>> VirtualBox, and QEMU/KVM might be quicker. I'm using Avira Antivirus,
>>> with exceptions for the Cygwin install folders (C:\cygwin64,
>>> C:\cygwin).
>>>
>>> It might be nice if we could do some comparisons so I can figure out
>>> what's wrong.
>
> Running strace on your forking executable should give you accurate
> numbers in the output, with some work to extract the relevant values.
>
>> Same AV here, W10 64bit (no VM), 2 year old Laptop
>>
>> model name      : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
>> 4 cores
>>
>> https://github.com/mondalaci/fork-benchmark
>> It seems there is an aging effect:
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 39.928576 seconds.
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 42.701295 seconds.
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 49.890909 seconds.
>>
>> $ ./fork-benchmark.exe 1000
>> Forked, executed and destroyed 1000 processes in 61.657031 seconds.
>
> It's all part of your current process tree.
> Running cygserver may help this with appropriate process settings:
>
> $ grep -C1 'kern\.srv\..*_.*' /etc/cygserver.conf
>
> # kern.srv.cleanup_threads: No. of cygserver threads used for cleanup
> tasks.
> # Default: 2, Min: 1, Max: 16, command line option -c, --cleanup-threads
> #kern.srv.cleanup_threads 2
> kern.srv.cleanup_threads 16
>
> # kern.srv.request_threads: No. of cygserver threads used to serve
> #                           application requests.
> # Default: 10, Min: 1, Max: 310, command line option -r,
> --request-threads
> #kern.srv.request_threads 10
> kern.srv.request_threads 310
>
> # kern.srv.process_cache_size: No. of concurrent processes which can
> be handled
> #                              by Cygserver concurrently.
> # Default: 62, Min: 1, Max: 310, command line option -p, --process-cache
> #kern.srv.process_cache_size 62
> kern.srv.process_cache_size 310
>
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 33.397727 seconds.
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 34.70389 seconds.
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 34.186709 seconds.
> $ ./fork-benchmark 1000
> Forked, executed and destroyed 1000 processes in 33.65649 seconds.
> $ sed -En '/^model name|^cpu MHz/p;/MHz/q' /proc/cpuinfo
> model name      : AMD A10-9700 RADEON R7, 10 COMPUTE CORES 4C+6G
> cpu MHz         : 3500.000
>
> $ strace -o fork-benchmark.strace ./fork-benchmark 10
> $ egrep '^--- Process [0-9]+ (crea|\(pid: [0-9]+\) exi)ted|dwProcessId
> [0-9]+|ExitProcess n 0x' fork-benchmark.strace > fork-benchmark.log
> $ awk '/ [0-9]+! pinfo::exit: /{t+=$2};END{print t/10000"ms"}'
> fork-benchmark.log
> 34.1855ms
>
> Faster CPUs, faster memory, bigger caches, SSD drive may help.

Just running in KVM seems to have helped me.

My times are more like 13 seconds for 1000 forks now. There appears to
be some drop-off, but after a big package build and some idling, it gets
back down to 13 seconds.

Is there a way for me to check that the cygserver options are working?

Either way, my package builds are about twice as quick as they were
before now, possibly just from a reinstall and using KVM.

Hamish

