While hanging out in the PowerShell forums, I came across a question about going out to a SharePoint site, finding all of the Word documents, and then scanning each document and fixing any hyperlinks that had spaces in them. While I didn’t provide the answer for connecting to SharePoint, I was able to help the user out with opening a document and fixing any hyperlinks with spaces.
This example Word document has three hyperlinks in it: two are valid and one uses a link that has spaces in it.
The first step is to connect to the Word document using the Word.Application COM object.
$word = New-Object -ComObject Word.Application
$document = $word.documents.open("C:\users\administrator\desktop\TEST.docx")
Finding all of the hyperlinks is actually very simple using the document's Hyperlinks property.
$document.Hyperlinks
We can tell from the image that the last hyperlink has some spaces in it that need to be updated. There is a gotcha to this that I will show a little later on, but first, how am I going to fix the hyperlink? I could use regex or a replace method for the spaces, but that seems like a little too much for something like this. Fortunately, we can use the System.Uri class to make this conversion without any hassle.
([uri]"http://domain.com/This is a bad link").AbsoluteUri
Perfect! Now we can work on making the updates to the bad hyperlink or hyperlinks, if applicable.
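If you want to reuse that conversion, here is a minimal sketch of a helper function. The name ConvertTo-EncodedUrl is my own invention for illustration and is not part of the original solution. It assumes that anything which does not parse as an absolute URI should be handed back untouched rather than cause an error.

```powershell
# Hypothetical helper (the name is an assumption, not from the original code).
# Returns the percent-encoded form of an absolute URL, or the input
# unchanged when it cannot be parsed as an absolute URI.
function ConvertTo-EncodedUrl {
    param(
        [Parameter(Mandatory)]
        [string]$Url
    )

    $uri = $null
    # Uri.TryCreate returns $false instead of throwing on bad input
    if ([uri]::TryCreate($Url, [System.UriKind]::Absolute, [ref]$uri)) {
        $uri.AbsoluteUri
    }
    else {
        $Url
    }
}

ConvertTo-EncodedUrl -Url 'http://domain.com/This is a bad link'
# http://domain.com/This%20is%20a%20bad%20link
```

Using TryCreate instead of a plain [uri] cast is just a design choice here: it keeps a single odd address from stopping a loop over many hyperlinks.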
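# Force the COM collection into a PowerShell array
$hyperlinks = @($document.Hyperlinks)
$hyperlinks | ForEach {
    # Only touch hyperlinks whose address contains whitespace
    If ($_.Address -match "\s") {
        $newURI = ([uri]$_.address).AbsoluteUri
        Write-Verbose ("Updating {0} to {1}" -f $_.Address,$newURI) -Verbose
        $_.address = $newURI
    }
}
# Save the changes and close Word
$document.save()
$word.quit()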
You will notice that I had to wrap $document.Hyperlinks in “@()” to make it an array. There is a quirk when working with COM objects: even though the output may resemble a collection, it does not always behave like one, so you may not be able to iterate through each of the objects or pull a specific item using array indexing. Forcing it into an array of objects works around this.
Now that I have finished this up, let's look at that hyperlink again.
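As an alternative to the “@()” wrapper, you can walk the collection by index. The sketch below is only an illustration: Get-CollectionItem is a hypothetical helper name, and it assumes the object exposes a Count property and a 1-based Item() method, which Word's Hyperlinks collection does.

```powershell
# Hypothetical helper (not from the original code): emits each member of a
# COM-style collection that has a Count property and a 1-based Item() method.
function Get-CollectionItem {
    param(
        [Parameter(Mandatory)]
        $Collection
    )

    # Word collections are 1-based, so start at 1 and include Count
    for ($i = 1; $i -le $Collection.Count; $i++) {
        $Collection.Item($i)
    }
}
```

With the document from earlier still open, this could be used as Get-CollectionItem -Collection $document.Hyperlinks and piped into the same ForEach block.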
All fixed! Everything seems great. However, the gotcha that I was talking about is that if you hover over the hyperlink, it still looks like it has spaces in it.
Another interesting thing is that even when looking at the link via PowerShell, it doesn’t show the “%20” that you would expect to see and still shows spaces instead.
This is important to note when running this code: because the address always reads back with spaces, the check will match every time and the code will always attempt to “fix” the hyperlink. I am not completely sure why it doesn’t show up correctly even when viewed through PowerShell, but I would assume it is another quirk of working with the Word COM object.
Hope that this helps out those that have come across this issue and wanted an automated solution to fix it!
Have you had problems accessing the Address property? When I print out the Hyperlinks collection, I see valid Address values using:
Write-Host ($_ | Format-List | Out-String)
But when I access the Address directly, I get a null reference using:
$_.Address
How could that be?