Fix Spaces in Hyperlinks That Exist in a Word Document

While hanging out in the PowerShell forums, I came across a question about going out to a SharePoint site, finding all of the Word documents, scanning each one, and fixing any hyperlinks that had spaces in them. While I didn’t provide the answer for connecting to SharePoint, I was able to help the user out with opening a document and fixing any hyperlinks with spaces.

This example Word document has three hyperlinks in it: two are valid and one uses a link that contains spaces.

image

The first step is to connect to the Word document using the Word.Application COM object.

$word = New-Object -ComObject Word.Application
$document = $word.documents.open("C:\users\administrator\desktop\TEST.docx")

Finding all of the hyperlinks is actually very simple using the hyperlinks property.

$document.Hyperlinks

image

We can tell from the image that the last hyperlink has some spaces in it that need to be fixed. There is a gotcha here that I will show a little later on, but first, how am I going to fix the hyperlink? I could use a regex or a replace method on the spaces, but that seems like a little too much for something like this. Fortunately, we can use the System.Uri class to make this conversion without any hassle: its AbsoluteUri property returns the address with the spaces escaped as %20.

([uri]"http://domain.com/This is a bad link").AbsoluteUri

image
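If the address might not be a full URL at all, a slightly more defensive version of the same idea can be sketched with [uri]::TryCreate. The function name ConvertTo-EscapedUri is my own invention for illustration and is not part of the original answer; it assumes you only want to touch absolute URIs and leave everything else alone.

```powershell
# Hypothetical helper (name is an assumption, not from the original post).
# Returns the escaped form of an absolute URI, or $null for anything else.
function ConvertTo-EscapedUri {
    param(
        [Parameter(Mandatory)]
        [string]$Address
    )

    $uri = $null
    # TryCreate returns $false instead of throwing when the input is unusable
    if ([uri]::TryCreate($Address, [System.UriKind]::Absolute, [ref]$uri)) {
        return $uri.AbsoluteUri
    }
    return $null
}

ConvertTo-EscapedUri -Address 'http://domain.com/This is a bad link'
# http://domain.com/This%20is%20a%20bad%20link
```

For the simple case in this post, the plain [uri] cast does the same job in one line.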

Perfect! Now we can work on making the updates to the bad hyperlink or hyperlinks, if applicable.

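# Wrap the collection in @() so it can be enumerated as an array
$hyperlinks = @($document.Hyperlinks)
$hyperlinks | ForEach-Object {
    # Only touch hyperlinks whose address contains whitespace
    If ($_.Address -match "\s") {
        # AbsoluteUri returns the escaped form (spaces become %20)
        $newURI = ([uri]$_.Address).AbsoluteUri
        Write-Verbose ("Updating {0} to {1}" -f $_.Address,$newURI) -Verbose
        $_.Address = $newURI
    }
}
# Save the changes and close Word
$document.Save()
$word.Quit()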

image

You will notice that I had to wrap $document.Hyperlinks in “@()” to make it an array. There is a quirk when working with COM objects: even though the output may resemble a collection, it does not behave like a collection in the way that you can iterate through each of the objects or even pull a specific item using array slicing. Forcing it into an array of objects with “@()” lets each hyperlink be processed individually.

Now that I have finished this up, let’s look at that hyperlink again.
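If you would rather not rely on the array wrapper, another option is to walk the collection by index using its Count property and Item() method. This is only a sketch: it assumes Word is installed, reuses the test path from earlier, and only reads the addresses rather than changing them.

```powershell
# Sketch only: requires Word; the path is the test document used earlier
$word = New-Object -ComObject Word.Application
$document = $word.Documents.Open("C:\users\administrator\desktop\TEST.docx")
try {
    $links = $document.Hyperlinks
    # Office collections are 1-based, so start at 1 and include Count
    for ($i = 1; $i -le $links.Count; $i++) {
        $link = $links.Item($i)
        '{0}: {1}' -f $i, $link.Address
    }
}
finally {
    # Nothing was modified, so just shut Word down
    $word.Quit()
}
```

The try/finally is there so that Word still exits if something in the loop throws.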

image

All fixed! Everything seems great; however, the gotcha that I mentioned earlier is that if you hover over the hyperlink, it still looks like it has spaces in it.

image

Another interesting thing is that even when looking at the link via PowerShell, it doesn’t show the “%20” that you would expect and instead still shows spaces.

image

This is important to note when running this code: because the Address property still reports spaces, the whitespace check matches every time, so the script will always attempt to “fix” the hyperlink. I am not completely sure why the escaped form doesn’t show up even when viewed through PowerShell, but I would assume it is another quirk of working with the Word COM object.

Hope that this helps out those that have come across this issue and wanted an automated solution to fix it!

About Boe Prox

Microsoft Cloud and Datacenter MVP working as a SQL DBA.

2 Responses to Fix Spaces in Hyperlinks That Exist in a Word Document

  1. Brian says:

    Have you had problems accessing the Address property? When I print out the Hyperlinks collection, I see valid Address values using:
    Write-Host ($_ | Format-List | Out-String)

    But when I access the Address directly, I get a null reference using:
    $_.Address

    How could that be?

