introduction
PowerShell's pipeline is slow. It passes objects rather than raw bytes between commands, and that carries significant per-item overhead. As long as the number of objects going through the pipeline is small, or processing each item takes more than a few milliseconds, this is fine. However, with line-oriented executables like the GNU Coreutils, where each line of output travels through the pipeline as its own object, it quickly becomes prohibitive. I've tried creating Execute-Cmd, but it has quirks that I have not fully worked out.
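To get a feel for the per-object overhead on its own, without any external executables, the following minimal sketch times the same trivial work once through the pipeline and once in a plain loop. The count and variable names are illustrative and not part of the measurements on this page. Absolute numbers will vary with machine and PowerShell version.

```powershell
$count = 100000   # illustrative size, not taken from the measurements on this page

# Each number travels through the pipeline as an individual object.
$viaPipeline = Measure-Command {
    1..$count | ForEach-Object { $_ } | Out-Null
}

# The same work in a plain loop, with no per-object pipeline hand-off.
$viaLoop = Measure-Command {
    foreach ($n in 1..$count) { $null = $n }
}

"{0,8:0} ms via pipeline" -f $viaPipeline.TotalMilliseconds
"{0,8:0} ms via foreach loop" -f $viaLoop.TotalMilliseconds
```

Raising or lowering $count shows how the gap between the two timings scales with the number of objects.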
performance analysis
A comparison of running grep -P [0-9] numbers.txt | wc -l in PowerShell (PS) vs. cmd.exe, on a 2.67 GHz Core 2 Duo. "Iterations" is the number of lines in numbers.txt:
> 1..5 | % { .\test_perf.ps1 ([Math]::Pow(10, $_)) }
10 iterations
36 ms ( 0 lines / ms) grep in PS
31 ms ( 0 lines / ms) grep in cmd.exe
100 iterations
51 ms ( 2 lines / ms) grep in PS
32 ms ( 3 lines / ms) grep in cmd.exe
1000 iterations
264 ms ( 4 lines / ms) grep in PS
37 ms ( 27 lines / ms) grep in cmd.exe
10000 iterations
2415 ms ( 4 lines / ms) grep in PS
36 ms ( 281 lines / ms) grep in cmd.exe
100000 iterations
24257 ms ( 4 lines / ms) grep in PS
67 ms (1495 lines / ms) grep in cmd.exe
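The PS timings grow roughly linearly with the line count, while the cmd.exe timings stay nearly flat. One rough way to put a number on that is the slope between the two largest runs. The sketch below only redoes that arithmetic on the figures from the table; the variable names are illustrative.

```powershell
# Elapsed times in ms for the 10000- and 100000-line runs, copied from the table.
$psSmall  = 2415;  $psLarge  = 24257
$cmdSmall = 36;    $cmdLarge = 67
$extraLines = 100000 - 10000

# Marginal cost per extra line = (change in time) / (change in line count).
$psPerLine  = ($psLarge  - $psSmall)  / $extraLines
$cmdPerLine = ($cmdLarge - $cmdSmall) / $extraLines

"PS:      {0:0.0000} ms per line" -f $psPerLine
"cmd.exe: {0:0.0000} ms per line" -f $cmdPerLine
"ratio:   {0:0}x" -f ($psPerLine / $cmdPerLine)
```

This works out to about 0.24 ms per line in PS against about 0.0003 ms per line in cmd.exe, a difference of roughly 700 times once the fixed startup cost is factored out.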
the code
Save this as test_perf.ps1 and invoke it as shown above.
param ($iterations = 10000)

# Run a script block and report elapsed time and throughput.
function Test-Perf($desc, $code)
{
    $sw = [System.Diagnostics.Stopwatch]::StartNew()
    & $code
    $sw.Stop()

    # Read the elapsed time once so both figures use the same value.
    $ms = $sw.Elapsed.TotalMilliseconds
    Write-Output ("{0,5:0} ms ({1,4:0} lines / ms) {2}" -f $ms, ($iterations / $ms), $desc)
}

# Build a file with one number per line.
$sb = New-Object System.Text.StringBuilder
for ($i = 0; $i -lt $iterations; $i++)
{
    [void]$sb.AppendLine($i)
}
$sb.ToString() | Out-File -Encoding ASCII numbers.txt

Write-Output ""
Write-Output "$iterations iterations"
Write-Output ""

# Same native pipeline, once driven by PowerShell and once entirely inside cmd.exe.
Test-Perf "grep in PS" { grep -P [0-9] numbers.txt | wc -l > $null }
Test-Perf "grep in cmd.exe" { cmd /c "grep -P [0-9] numbers.txt | wc -l > nul" }
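The cmd.exe measurement suggests one possible workaround: hand the whole native pipeline to cmd.exe so that the intermediate data never passes through PowerShell's object pipeline, and bring back only the final result. The helper below is a hypothetical sketch of that idea. The name Invoke-CmdPipeline is made up for illustration and is not the Execute-Cmd script mentioned in the introduction. It assumes Windows with cmd.exe available, and it leaves quoting of special characters entirely to the caller.

```powershell
# Hypothetical helper: run a complete command line inside cmd.exe.
# The pipes in $CommandLine are interpreted by cmd.exe, not by PowerShell.
function Invoke-CmdPipeline {
    param (
        [Parameter(Mandatory = $true)]
        [string] $CommandLine
    )

    # Only the final output of the whole pipeline comes back as objects.
    cmd /c $CommandLine
}

# Usage (assumes grep, wc, and numbers.txt from the script above are present):
# only the single line printed by wc crosses into PowerShell.
$lineCount = Invoke-CmdPipeline 'grep -P [0-9] numbers.txt | wc -l'
```

The trade-off is that everything inside the string follows cmd.exe's parsing rules, so PowerShell variables and cmdlets cannot appear in the middle of the pipeline.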