Quick Hits: Speed Up Some of Your Commands by Avoiding the Pipeline

Sometimes running a command through the pipeline can take a while, depending on the amount of data being processed. As a simple example, I will use Get-Random to demonstrate the differences in speed and memory usage.

("Memory Before: {0:#,#}KB" -f ((Get-Process -Id $PID).PeakWorkingSet /1kb))
(Measure-Command {
    1..1E6 | Get-Random
}).TotalSeconds
("Memory After: {0:#,#}KB" -f ((Get-Process -Id $PID).PeakWorkingSet /1kb))


Here you see that it takes a little over a minute to run, but the amount of memory used isn’t a whole lot: about 6KB.

Instead, take a look at the -InputObject parameter, which accepts the entire collection of objects at once and delivers much faster performance.

("Memory Before: {0:#,#}KB" -f ((Get-Process -Id $PID).PeakWorkingSet /1kb))
(Measure-Command {
    Get-Random -InputObject (1..1E6)
}).TotalSeconds
("Memory After: {0:#,#}KB" -f ((Get-Process -Id $PID).PeakWorkingSet /1kb))


This didn’t even take a second to run using the -InputObject parameter. However, the memory required to perform the operation jumped by about 20KB, because the entire collection is built in memory before it is handed to the cmdlet.

Of course, that is the catch to this performance increase: it consumes more memory than the pipeline does. So be sure to keep this in mind when running against a lot of data to avoid memory issues.
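If you find yourself repeating this before-and-after measurement, you could wrap it in a small helper. Below is a minimal sketch; the Measure-ScriptBlock name is my own invention, and it assumes PowerShell 3.0 or later for the [pscustomobject] syntax:

Function Measure-ScriptBlock {
    Param (
        [scriptblock]$ScriptBlock
    )
    # Snapshot the current working set before running the code
    $before = (Get-Process -Id $PID).WorkingSet64
    $seconds = (Measure-Command -Expression $ScriptBlock).TotalSeconds
    # Snapshot again afterwards and report elapsed time plus the delta
    $after = (Get-Process -Id $PID).WorkingSet64
    [pscustomobject]@{
        Seconds = $seconds
        DeltaKB = [math]::Round(($after - $before) / 1KB)
    }
}

Measure-ScriptBlock { 1..1E6 | Get-Random }
Measure-ScriptBlock { Get-Random -InputObject (1..1E6) }

Note that this uses the current working set rather than the peak, so it shows the change caused by the script block rather than the process high-water mark.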

This is especially true when working with large logs. Cmdlets such as Get-Content and Import-Csv do not have an -InputObject parameter, and with the amount of data they can return it is better to let the pipeline stream it rather than risk throwing an Out of Memory exception. Assume that we have a 600MB+ CSV file and we want to know something simple, such as how many rows it contains. The following has the potential to throw the OOM error (this actually happened to me at work):

(Import-Csv -Path log.csv).Count

In order to avoid this, I made use of the Measure-Object cmdlet by piping the output of Import-Csv into Measure-Object to get the count.

Import-Csv -Path log.csv | Measure-Object
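If all you really need is a row count, you can go a step further and skip the CSV parsing entirely by streaming the raw lines; only one line is held in memory at a time, so usage stays flat even on very large files. A minimal sketch, subtracting one for the header row and assuming the current location is a filesystem path:

# Count the lines as they stream through, then drop the header row
(Get-Content -Path log.csv | Measure-Object).Count - 1

# A faster .NET variant (requires .NET 4.0+); ReadLines resolves
# relative paths against the process working directory, so build a
# full path from $PWD
$count = 0
foreach ($line in [System.IO.File]::ReadLines("$PWD\log.csv")) {
    $count++
}
$count - 1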

Of course, these are very generic examples, but hopefully you can see the benefit of each of these approaches in your scripts. The bottom line: use the pipeline to avoid memory issues at the cost of slower performance, and avoid the pipeline when you want faster code at the expense of more memory.


1 Response to Quick Hits: Speed Up Some of Your Commands by Avoiding the Pipeline

  1. Hi Box!
    Again, a very interesting post!
    Thank you!
    Did you know about the discussion on the ForEach loop versus the ForEach-Object cmdlet?
    http://bsonposh.com/archives/327
    You have to read the comments there!
