Split-Path Performance in PowerShell

While working on my previous article, I was trying to determine the best way to lay out all of the properties I wanted. While I was getting the full path of a file, I also wanted just the filename in my output to make it easier to read when displayed in the console. At first I thought the obvious answer was to use Split-Path with the -Leaf parameter to get the filename, but in testing I found that this caused a performance hit on my function. Although the performance difference is measured in milliseconds, it is still something I would like to fine-tune to make the function run faster.

Using the following example as the baseline, I am going to split the filename off a path 1000 times per run, repeat that run 50 times, and take the average time per run in milliseconds. Starting with my Split-Path approach, you will see the time in milliseconds that it took on average to complete.

Split-Path Example

1..50 | ForEach {
    (Measure-Command {
        1..1000 | ForEach {
            $File = "C:\users\Administrator\Desktop\100.jpg" 
            Split-Path -Path $File -Leaf
        }
    }).TotalMilliseconds
} | Measure-Object -Average

Count    : 50
Average  : 340.56
Sum      :
Maximum  :
Minimum  :
Property :

Substring Example

I looked at using the Substring() method, which is available on a string object, to split off the filename from the full path. To do this, I also had to use the LastIndexOf() method and supply the backslash “\” so that it finds the final backslash in the path; everything after that position is the filename. The example below shows how this turned out.

1..50 | ForEach {
    (Measure-Command {
        1..1000 | ForEach {
            $File = "C:\users\Administrator\Desktop\100.jpg" 
            $File.Substring($File.LastIndexOf("\")+1)
        }
    }).TotalMilliseconds
} | Measure-Object -Average

Count    : 50
Average  : 54.897506
Sum      :
Maximum  :
Minimum  :
Property :

A pretty significant improvement in speed. Obviously, as you add more files into the equation, the amount of time will increase. I’ll demonstrate that later along with some other interesting things that I found.
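To make the Substring() call easier to follow, here is a minimal sketch that breaks the same expression into steps, using the sample path from the test above. The variable names $lastSlash, $leaf and $noSlash are only for illustration and do not appear in my function.

```powershell
# Sample path from the test above
$File = 'C:\users\Administrator\Desktop\100.jpg'

# Zero-based position of the final backslash in the string
$lastSlash = $File.LastIndexOf('\')    # 30

# Everything after that backslash is the filename
$leaf = $File.Substring($lastSlash + 1)    # 100.jpg

# If there is no backslash at all, LastIndexOf() returns -1,
# so Substring(0) simply hands back the whole string
$noSlash = '100.jpg'
$leafNoSlash = $noSlash.Substring($noSlash.LastIndexOf('\') + 1)    # 100.jpg

$lastSlash, $leaf, $leafNoSlash
```

The -1 case is worth knowing about because it means the one-liner does not throw when it is handed a bare filename.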

But that is not all folks! There is yet another way to split off the filename from a path. This time it comes from the GetFileName() method of the IO.Path type. Let's give that a shot using the same process as before to see what happens.

1..50 | ForEach {
    (Measure-Command {
        1..1000 | ForEach {
            $File = "C:\users\Administrator\Desktop\100.jpg" 
            [io.path]::GetFileName($file)
        }
    }).TotalMilliseconds
} | Measure-Object -Average

Count    : 50
Average  : 51.22724
Sum      :
Maximum  :
Minimum  :
Property :

This is the fastest method yet to grab the filename from a path: roughly 290 milliseconds faster than Split-Path and a little faster than the Substring() method.

Because I had some free time, I wrote a little script to really get an idea on the differences in speed between these 3 methods. The script name is SplitPathPerfDemo.ps1 and the source code is available at the end of the article.

This script has a couple of parameters:

  • MaxFiles
    • The number of files to simulate; the filename is split from the path once for each file.
  • Iterations
    • The number of times to repeat the measurement of splitting the filename from the given path.
  • FilePath
    • The full path to a file.

The basics of the script are that it runs the given number of iterations against the given number of files, gets an average time in milliseconds for each type of split, and displays the output on the screen. You can also specify a specific path (it doesn’t have to exist) so you can see the effect of the depth of the path as well.

First let's run it with the default values (100 iterations, 100 files, 46 characters in path).

.\SplitPathPerfDemo.ps1 | 
    Sort Time_ms -Descending | 
        Format-Table

image

Just as we saw before, the slowest was Split-Path and the fastest was using the GetFileName() method of IO.Path.

Now let's increase the number of files to 1000 and see the results.

.\SplitPathPerfDemo.ps1 -MaxFiles 1000 | 
    Sort Time_ms -Descending | 
        Format-Table

image

Again, Split-Path was the slowest with the Substring() method being the fastest this time.

Something that I wanted to see was what happens when the path depth increases. Would the time increase as well for each of the methods? This run leaves Iterations at its default value of 100, passes -MaxFiles 1000 as shown in the command, and uses a path that is 196 characters long.

.\SplitPathPerfDemo.ps1 -Maxfiles 1000 -FilePath "C:\Users\Administrator\Desktop\PowerShellScripts\Thisisalonglongnamedfolderformetouse\AnotherThisisalonglongnamedfolderformetouse\YetAnotherThisisalonglongnamedfolderformetouse\Get-Constructor.ps1" | Sort Time_ms -Descending | Format-Table

image

Compare that to the same defaults of the 46 character path.

image

Now the more interesting side of this. The actual time difference is more due to the backslashes “\” in the path than the actual number of characters in the path. The following example is with only 2 backslashes.

.\SplitPathPerfDemo.ps1 -Maxfiles 1000 -FilePath "C:\Users123456AdministratorDesktopPowerShellScriptsThisisalonglongnamedfolderformetouseAnotherThisisalonglongnamedfolderformetouseYetAnotherThisisalonglongnamedfolderformetouse\Get-Constructor.ps1" | Sort Time_ms -Descending | Format-Table

image

Now for one with more backslashes in it.

.\SplitPathPerfDemo.ps1 -Maxfiles 1000 -FilePath "C:\a\b\c\d\e\f\g\h\j\k\l\l\g\h\y\r\e\d\Get-Constructor.ps1" | Sort Time_ms -Descending | Format-Table

image

This path has only 58 characters, but with all of those backslashes, it is almost as time consuming as the splitting of a path with 196 characters! With that, it seems that the more backslashes there are in a path, the greater the impact on the performance of splitting it.
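If you want to line up your own timings against these two factors, a small sketch like the following reports the length and the number of backslashes for each path. It uses three of the paths from this article; the $paths and $pathStats variables are hypothetical helpers and are not part of SplitPathPerfDemo.ps1.

```powershell
# Three of the paths used in the tests above
$paths = @(
    'C:\users\Administrator\Desktop\users\photo.jpg'
    'C:\Users\Administrator\Desktop\PowerShellScripts\Thisisalonglongnamedfolderformetouse\AnotherThisisalonglongnamedfolderformetouse\YetAnotherThisisalonglongnamedfolderformetouse\Get-Constructor.ps1'
    'C:\a\b\c\d\e\f\g\h\j\k\l\l\g\h\y\r\e\d\Get-Constructor.ps1'
)

$pathStats = foreach ($path in $paths) {
    New-Object PSObject -Property @{
        # Filename portion, so each row is easy to identify
        Leaf        = $path.Substring($path.LastIndexOf('\') + 1)
        # Total number of characters in the path
        Length      = $path.Length
        # Splitting on a literal backslash yields one more piece than there are backslashes
        Backslashes = ($path -split '\\').Count - 1
    }
}

$pathStats | Sort-Object Backslashes | Format-Table Leaf, Length, Backslashes
```

Sorting by Backslashes rather than Length makes it easier to compare against the timing tables, since the separator count is what seemed to matter more in my runs.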

So with that, it would seem that using IO.Path with the GetFileName() method is the way to quickly get the filename from a path. Of course, if you are using Get-ChildItem, then you have no need for this method, because each item it returns already exposes the filename in its Name property. But for those special cases when you would need to perform multiple split operations, this might be something to keep in mind.
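As a quick illustration of the Get-ChildItem case, here is a minimal sketch that reads the filename and the full path straight off the returned objects, with no splitting involved. The SplitPathDemo scratch folder and the $demoDir and $items variables are made up for this example; the folder is created in the temp directory and removed again at the end.

```powershell
# Hypothetical scratch folder, created only for this demonstration
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) 'SplitPathDemo'
New-Item -ItemType Directory -Path $demoDir -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $demoDir '100.jpg') -Force | Out-Null

# Name holds just the filename, FullName holds the complete path
$items = @(Get-ChildItem -Path $demoDir | Select-Object Name, FullName)
$items

# Clean up the scratch folder
Remove-Item -Path $demoDir -Recurse -Force
```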

Source Code for SplitPathPerfDemo.ps1

[cmdletbinding()]
Param (
    [int]$MaxFiles = 100,
    [int]$Iterations = 100,
    [string]$FilePath = "C:\users\Administrator\Desktop\users\photo.jpg"
)
1..$Iterations | ForEach {
    (Measure-Command {
        1..$MaxFiles | ForEach {
            Split-Path -Path $FilePath -Leaf
        }
    }) | Select TotalMilliseconds, TotalSeconds
} | Measure-Object -Average -Property TotalMilliseconds | ForEach {
    New-Object PSObject -Property @{
        Test = 'Split-Path'
        Iterations = $Iterations
        FilesCount = $MaxFiles
        Time_ms = [math]::Round($_.Average,2)
        PathDepth = $FilePath.length
    }
}

1..$Iterations | ForEach {
    (Measure-Command {
        1..$MaxFiles | ForEach {
            $FilePath.Substring($FilePath.LastIndexOf("\")+1)
        }
    }) | Select TotalMilliseconds, TotalSeconds
} | Measure-Object -Average -Property TotalMilliseconds | ForEach {
    New-Object PSObject -Property @{
        Test = 'Substring()'
        Iterations = $Iterations
        FilesCount = $MaxFiles
        Time_ms = [math]::Round($_.Average,2)
        PathDepth = $FilePath.length
    }
}

1..$Iterations | ForEach {
    (Measure-Command {
        1..$MaxFiles | ForEach {
            [io.path]::GetFileName($FilePath)
        }
    }) | Select TotalMilliseconds, TotalSeconds
} | Measure-Object -Average -Property TotalMilliseconds | ForEach {
    New-Object PSObject -Property @{
        Test = 'io.path'
        Iterations = $Iterations
        FilesCount = $MaxFiles
        Time_ms = [math]::Round($_.Average,2)
        PathDepth = $FilePath.length
    }
}

About Boe Prox

Microsoft Cloud and Datacenter MVP working as a SQL DBA.

3 Responses to Split-Path Performance in PowerShell

  1. mjolinor says:

    Correction – splitting 1000 filenames 50 times took 96 ms, not 96 seconds.

  2. mjolinor says:

    You’ve got one more possibility to consider – regex. Using $file -replace '.+\\(.+)','$1' on my system it benchmarks about the same as the substring method. But in cases where you have an array of fullnames, the -replace operator doesn’t require the foreach. Splitting the leaf off of an array of 1000 full paths 50 times took 96 seconds on my system. That’s less than 2ms / 1000 files.

    • Boe Prox says:

      Nice! I didn’t even consider using regex in my demos. I also like the idea of collecting an array of items and using the -replace. Great stuff Rob!
