Using PowerShell to Query Web Site Information

In my last post, I showed you how to use the Test-Connection cmdlet to test your internet access.  In this post, I will show you how to use Net.WebRequest or Net.WebClient to send a request to a website and verify whether you have internet access. There is more you can do with these classes than just testing internet access or checking that a website is up and running, but that is beyond the scope of what I am going to show you here.

Net.WebClient Class

The first class I will show is Net.WebClient, which you can use to access a web page.  Using this class and its associated methods will actually download the page source of the website.  If you look at the methods, you will also see ones for downloading files, downloading data, and uploading data and files. Again, this is more than I will go into at this time, but it does give you some pretty cool things to do with this class.

First off, let's create the object using the Net.WebClient class and view those methods.

$web = New-Object Net.WebClient
$web | Get-Member

As you can see, there are a lot of methods and even some events that you can leverage when using this class. For the sake of this post, I will stick to the DownloadString() method and use it to test a connection to a web site.
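
If the full Get-Member listing is too much to scan, you can narrow it down. Here is a small sketch that lists only the methods whose names start with "Download"; the variable name $downloadMethods is just something I picked for illustration.

```powershell
# Create the WebClient object as before
$web = New-Object System.Net.WebClient

# Keep only the methods whose names start with "Download"
$downloadMethods = $web | Get-Member -MemberType Method -Name Download* |
    Select-Object -ExpandProperty Name

# Show the filtered list of method names
$downloadMethods
```

DownloadString should appear in that list along with the file and data download methods.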

I will call DownloadString() to download the web page and display it in the PowerShell console. Keep in mind that the output is not in a very user-friendly format, but that is ok because we only care whether we can actually access the site.

$web.DownloadString("http://www.bing.com")

Holy cow! That is a lot of wild output, isn’t it? As you can see, the entire bing.com webpage has been downloaded and is now being displayed in the PowerShell console.  Just for the sake of something a little bit smaller, I will run this against microsoft.com as well to show.

$web.DownloadString("http://microsoft.com")

Ok, that’s better. Again, just by seeing this information we know that we are able to connect to the internet, and that we are able to access the web page without any issues.

So we know what happens when a site is active or our internet connection is active, but what happens if the connection is down?

Try {
	$web.DownloadString("https://hereisasite.net")
	} 
Catch {
	Write-Warning "$($error[0])"
	}
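
If all you want is a yes or no answer, the Try/Catch above can be wrapped into a small helper. This is only a sketch: the function name Test-WebAccess is my own invention for this example and not a built-in cmdlet.

```powershell
function Test-WebAccess {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url
    )
    $client = New-Object System.Net.WebClient
    try {
        # Download the page but throw the content away; we only care that it worked
        $null = $client.DownloadString($Url)
        return $true
    }
    catch {
        # Any failure (DNS, refused connection, HTTP error, bad URL) counts as no access
        Write-Verbose "Could not reach ${Url}: $($_.Exception.Message)"
        return $false
    }
    finally {
        # Release the WebClient whether the download succeeded or not
        $client.Dispose()
    }
}
```

You could then write something like If (Test-WebAccess -Url "http://www.bing.com") { ... } in a script instead of repeating the Try/Catch each time.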

Knowing that you can use Try/Catch when making the web site connection, you can then use looping (Do Until, Do While, While) to continue to attempt a connection until it is able to do so, and then send some sort of notification when successful.

Begin {
    $web = New-Object System.Net.WebClient
    $flag = $false
    }
Process {
    While ($flag -eq $false) {
        Try {
            $web.DownloadString("https://hereisasite.net")
            $flag = $True
            }
        Catch {
            Write-host -fore Red -nonewline "Access down..."
            #Pause between attempts so the loop does not hammer the CPU or the site
            Start-Sleep -Seconds 5
            }
        }
    }    
End {
    Write-Host -fore Green "Access is back"
    }

Had I used a legit site and killed my connection and then brought it back online, you would have seen the green text stating the access was back.  However, I was listening to the PowerScripting Podcast and didn’t want to miss out on listening to it.
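
The loop above will keep trying forever. If you would rather give up after a set number of tries, here is a rough sketch of a bounded version. The function name Wait-WebSite and its parameters (Probe, MaxAttempts, DelaySeconds) are assumptions I made for this example; the probe is any script block that throws when the site cannot be reached.

```powershell
function Wait-WebSite {
    param(
        [Parameter(Mandatory = $true)]
        [scriptblock]$Probe,
        [int]$MaxAttempts = 5,
        [int]$DelaySeconds = 2
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # Run the probe and discard its output; an exception means "still down"
            $null = & $Probe
            # Success: report which attempt got through
            return $attempt
        }
        catch {
            Write-Warning "Attempt $attempt failed: $($_.Exception.Message)"
            # Wait before the next try, but not after the last one
            if ($attempt -lt $MaxAttempts) {
                Start-Sleep -Seconds $DelaySeconds
            }
        }
    }
    # Every attempt failed
    return 0
}
```

A call such as Wait-WebSite -Probe { (New-Object Net.WebClient).DownloadString("http://www.bing.com") } returns the number of the attempt that succeeded, or 0 if none did, so you can decide what kind of notification to send.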

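One last item on using this class for your connecting and verifying needs. This doesn’t really apply to testing the connection but is still pretty cool: you can show how large the web page you are downloading is. Strictly speaking, the Length of the downloaded string is a character count, which for a mostly ASCII page is a close approximation of its size in bytes.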

Let’s re-look at bing.com and find out how large their page is…

"{0} bytes" -f ($web.DownloadString("http://bing.com")).length.toString("###,###,##0")

As you can see, it is roughly 27KB in size for bing.com. Again, pretty cool if you wanted to know just how big a page is that you are downloading to your browser.
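
One caveat on the number above: DownloadString() returns decoded text, so Length counts characters rather than bytes. Here is a quick sketch of the difference; Get-TextSize is a hypothetical helper name, and I am assuming UTF-8 as the encoding to measure against.

```powershell
function Get-TextSize {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Text
    )
    New-Object PSObject -Property @{
        # Number of characters in the decoded string
        Characters = $Text.Length
        # Number of bytes the same text occupies when encoded as UTF-8
        Utf8Bytes  = [System.Text.Encoding]::UTF8.GetByteCount($Text)
    }
}

# Plain ASCII text: both counts are the same
Get-TextSize -Text "hello"
```

If you want the raw byte count of what the server actually sent, $web.DownloadData("http://bing.com").Length gives you that directly, since DownloadData() returns a byte array.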

Net.WebRequest Class

Using this class, I will show you how to get a response from a web server hosting a site. Unlike the Net.WebClient class, you will not be able to download the site page. However, with this class, you can get a response code and also see what type of web server is being used to host the site.

First we will create the object that we can then use to get a response back from the web site.  Since this class does not have a constructor we can use like the previous class, we will use a static method available from this class to create the object. Using the Create method also requires that we input the web site name as well.

$webRequest = [net.WebRequest]::Create("http://microsoft.com")
$webRequest | gm
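
Note that Create() does not contact the server; it only builds the request object, so you can adjust it before asking for a response. Here is a small sketch of that. Switching to the HEAD method (which asks the server for headers only) and the 10 second timeout are my own choices for illustration.

```powershell
# Build the request; nothing is sent over the network yet
$request = [System.Net.WebRequest]::Create("http://microsoft.com")

# For an http:// URL the object handed back is an HttpWebRequest
$request.GetType().FullName

# Ask for headers only instead of the whole page
$request.Method = "HEAD"

# Give up after 10 seconds (the value is in milliseconds)
$request.Timeout = 10000
```

The request is only sent once you call GetResponse() on the object.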

There are a few methods here to use, but the main one that we will use is GetResponse(). This method returns data about the website such as the type of web server being used to host the site, the status code, the description and even the size of the web page, much like I showed you could do using the Net.WebClient class by downloading the page as a string and getting its length.

$webrequest.GetResponse()

Calling the GetResponse() method shows you the following:

IsMutuallyAuthenticated : False
Cookies                 : {}
Headers                 : {VTag, Accept-Ranges, Content-Length, Cache-Control...}
ContentLength           : 1020
ContentEncoding         :
ContentType             : text/html
CharacterSet            : ISO-8859-1
Server                  : Microsoft-IIS/7.5
LastModified            : 3/16/2009 3:35:26 PM
StatusCode              : OK
StatusDescription       : OK
ProtocolVersion         : 1.1
ResponseUri             : http://www.microsoft.com/
Method                  : GET
IsFromCache             : False

I am not much of a web person, but from some of the information given, I can see that the microsoft.com homepage is approximately 1KB in size and that the web server hosting this page is IIS 7.5.  One thing to note is that the StatusCode says OK.  If you look at the type of this value, you will see that it is a System.Net.HttpStatusCode enumeration. If you look at the documentation for that type, you can see that there are many members that relate to the different codes a web site may return.

You can convert this object into an integer to get the code by casting it as an integer.

(($webRequest.GetResponse()).Statuscode) -as [int]

PS C:\Users\boe> (($webRequest.GetResponse()).Statuscode) -as [int]
200

I would recommend this approach because the StatusDescription already mirrors the StatusCode name; converting the code to an integer avoids showing the exact same output twice and lets the two properties complement each other.
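
You can try the conversion without touching the network at all, because the names and numbers live in the System.Net.HttpStatusCode enumeration itself. A quick sketch:

```powershell
# Enum name to number
[int][System.Net.HttpStatusCode]::OK          # 200
[int][System.Net.HttpStatusCode]::NotFound    # 404

# Number back to enum name
[System.Net.HttpStatusCode]503                # ServiceUnavailable
```

The cast works in both directions, so you can store the integer in a report and still turn it back into a readable name later.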

Assuming my network connection or the site itself was down, you could use a Try/Catch statement to catch the error and, just as with the Net.WebClient class, use a loop to monitor the site.
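
Here is a rough sketch of what that Try/Catch could look like with Net.WebRequest. The function name Get-WebStatus and the TimeoutMs parameter are made up for this example. The idea is that an HTTP error such as a 404 still carries a response you can read the status code from, while a failure to connect at all does not.

```powershell
function Get-WebStatus {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url,
        [int]$TimeoutMs = 5000
    )
    $statusCode = $null
    $message = 'OK'
    try {
        $request = [System.Net.WebRequest]::Create($Url)
        $request.Timeout = $TimeoutMs
        $response = $request.GetResponse()
        $statusCode = [int]$response.StatusCode
        # Close the response so the connection is released
        $response.Close()
    }
    catch {
        # PowerShell may wrap the .NET exception, so walk the chain
        # until we find the WebException (if there is one)
        $ex = $_.Exception
        while ($ex -and -not ($ex -is [System.Net.WebException])) {
            $ex = $ex.InnerException
        }
        # HTTP errors still carry a response with a status code;
        # DNS or connection failures leave StatusCode empty
        if ($ex -and $ex.Response) {
            $statusCode = [int]$ex.Response.StatusCode
        }
        $message = $_.Exception.Message
    }
    New-Object PSObject -Property @{
        Url        = $Url
        StatusCode = $statusCode
        Message    = $message
    }
}
```

Calling Get-WebStatus -Url "http://microsoft.com" inside a loop gives you one object per check, with an empty StatusCode whenever the site could not be reached.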

I have written a couple of advanced functions that utilize both of these .Net classes:

Get-WebPage, which is a wrapper for Net.WebClient

Get-WebSite, which is a wrapper for Net.WebRequest

Code

Get-WebPage:

Script Repository

CodePlex

Get-WebSite:

Script Repository

CodePlex

function Get-WebPage {
<#  
.SYNOPSIS  
   Downloads web page from site.
.DESCRIPTION
   Downloads web page from site and displays source code or displays total bytes of webpage downloaded
.PARAMETER Url
    URL of the website to test access to.
.PARAMETER UseDefaultCredentials
    Use the currently authenticated user's credentials  
.PARAMETER Proxy
    Used to connect via a proxy
.PARAMETER Credential
    Provide alternate credentials 
.PARAMETER ShowSize
    Displays the size of the downloaded page in bytes                 
.NOTES  
    Name: Get-WebPage
    Author: Boe Prox
    DateCreated: 08Feb2011        
.EXAMPLE  
    Get-WebPage -url "http://www.bing.com"
    
Description
------------
Downloads the Bing.com web page and displays its source code.

#> 
[cmdletbinding(
	DefaultParameterSetName = 'url',
	ConfirmImpact = 'low'
)]
    Param(
        [Parameter(
            Mandatory = $True,
            Position = 0,
            ParameterSetName = '',
            ValueFromPipeline = $True)]
            [string][ValidatePattern("^(http|https)\://*")]$Url,
        [Parameter(
            Position = 1,
            Mandatory = $False,
            ParameterSetName = 'defaultcred')]
            [switch]$UseDefaultCredentials,
        [Parameter(
            Mandatory = $False,
            ParameterSetName = '')]
            [string]$Proxy,
        [Parameter(
            Mandatory = $False,
            ParameterSetName = 'altcred')]
            [switch]$Credential,
        [Parameter(
            Mandatory = $False,
            ParameterSetName = '')]
            [switch]$ShowSize                        
                        
        )
Begin {     
    $psBoundParameters.GetEnumerator() | % { 
        Write-Verbose "Parameter: $_" 
        }
   
    #Create the initial WebClient object
    Write-Verbose "Creating web client object"
    $wc = New-Object Net.WebClient 
    
    #Use Proxy address if specified
    If ($PSBoundParameters.ContainsKey('Proxy')) {
        #Create Proxy Address for Web Request
        Write-Verbose "Creating proxy address and adding into Web Request"
        $wc.Proxy = New-Object -TypeName Net.WebProxy($proxy,$True)
        }       
    
    #Determine if using Default Credentials
    If ($PSBoundParameters.ContainsKey('UseDefaultCredentials')) {
        #Set to True, otherwise remains False
        Write-Verbose "Using Default Credentials"
        $wc.UseDefaultCredentials = $True
        }
    #Determine if using Alternate Credentials
    If ($PSBoundParameters.ContainsKey('Credential')) {
        #Prompt for alternate credentials
        Write-Verbose "Prompt for alternate credentials"
        $wc.Credentials = (Get-Credential).GetNetworkCredential()
        }         
        
    }
Process {    
    Try {
        If ($ShowSize) {
            #Get the size of the webpage
            Write-Verbose "Downloading web page and determining size"
            "{0:N0}" -f ($wc.DownloadString($url)).Length
            }
        Else {
            #Get the contents of the webpage
            Write-Verbose "Downloading web page and displaying source code" 
            $wc.DownloadString($url)       
            }
        
        }
    Catch {
        Write-Warning "$($Error[0])"
        }
    }   
}  

 

function Get-WebSite {
<#  
.SYNOPSIS  
    Retrieves information about a website.
.DESCRIPTION
    Retrieves information about a website.
.PARAMETER Url
    URL of the website to test access to.
.PARAMETER UseDefaultCredentials
    Use the currently authenticated user's credentials  
.PARAMETER Proxy
    Used to connect via a proxy
.PARAMETER TimeOut
    Timeout to connect to site, in milliseconds
.PARAMETER Credential
    Provide alternate credentials              
.NOTES  
    Name: Get-WebSite
    Author: Boe Prox
    DateCreated: 08Feb2011        
.EXAMPLE  
    Get-WebSite -url "http://www.bing.com"
    
Description
------------
Returns information about Bing.Com to include StatusCode and type of web server being used to host the site.

#> 
[cmdletbinding(
	DefaultParameterSetName = 'url',
	ConfirmImpact = 'low'
)]
    Param(
        [Parameter(
            Mandatory = $True,
            Position = 0,
            ParameterSetName = '',
            ValueFromPipeline = $True)]
            [string][ValidatePattern("^(http|https)\://*")]$Url,
        [Parameter(
            Position = 1,
            Mandatory = $False,
            ParameterSetName = 'defaultcred')]
            [switch]$UseDefaultCredentials,
        [Parameter(
            Mandatory = $False,
            ParameterSetName = '')]
            [string]$Proxy,
        [Parameter(
            Mandatory = $False,
            ParameterSetName = '')]
            [Int]$Timeout,
        [Parameter(
            Mandatory = $False,
            ParameterSetName = 'altcred')]
            [switch]$Credential            
                        
        )
Begin {     
    $psBoundParameters.GetEnumerator() | % { 
        Write-Verbose "Parameter: $_" 
        }
   
    #Create the initial WebRequest object using the given url
    Write-Verbose "Creating the web request object"        
    $webRequest = [net.WebRequest]::Create($url)
    
    #Use Proxy address if specified
    If ($PSBoundParameters.ContainsKey('Proxy')) {
        #Create Proxy Address for Web Request
        Write-Verbose "Creating proxy address and adding into Web Request"
        $webRequest.Proxy = New-Object -TypeName Net.WebProxy($proxy,$True)
        }
        
    #Set timeout
    If ($PSBoundParameters.ContainsKey('TimeOut')) {
        #Setting the timeout on web request
        Write-Verbose "Setting the timeout on web request"
        $webRequest.Timeout = $timeout
        }        
    
    #Determine if using Default Credentials
    If ($PSBoundParameters.ContainsKey('UseDefaultCredentials')) {
        #Set to True, otherwise remains False
        Write-Verbose "Using Default Credentials"
        $webrequest.UseDefaultCredentials = $True
        }
    #Determine if using Alternate Credentials
    If ($PSBoundParameters.ContainsKey('Credential')) {
        #Prompt for alternate credentials
        Write-Verbose "Prompt for alternate credentials"
        $webRequest.Credentials = (Get-Credential).GetNetworkCredential()
        }            
        
    #Set TimeStamp prior to attempting connection    
    $then = get-date
    }
Process {    
    Try {
        #Make connection to gather response from site
        $response = $webRequest.GetResponse()
        #If successful, get the date for comparison
        $now = get-date 
        
        #Generate report
        Write-Verbose "Generating report from website connection and response"  
        $report = @{
            URL = $url
            StatusCode = $response.Statuscode -as [int]
            StatusDescription = $response.StatusDescription
            ResponseTime = "$(($now - $then).totalseconds)"
            WebServer = $response.Server
            Size = $response.contentlength
            } 
        }
    Catch {
        #Get timestamp of failed attempt
        $now = get-date
        #Put the current error into a variable for later use
        $errorstring = "$($error[0])"
        
        #Generate report
        $report = @{
            URL = $url
            StatusCode = ([regex]::Match($errorstring,"\b\d{3}\b")).value
            StatusDescription = (($errorstring.split('\)')[2]).split('.\')[0]).Trim()
            ResponseTime = "$(($now - $then).totalseconds)" 
            WebServer = $response.Server
            Size = $response.contentlength
            }   
        }
    }
End {        
    #Display Report    
    New-Object PSObject -property $report  
    }    
}  
About Boe Prox

Microsoft PowerShell MVP working as a Senior Systems Administrator
26 Responses to Using PowerShell to Query Web Site Information

  1. Pingback: The Devil At Work » Download-Skript in Powershell

  2. Excellent info. $webrequest.GetResponse() is what I needed to track down where a 301 redirect was pointing to. Thanks!

  3. Pingback: SQLCMD and PowerShell: Generate 450 Million Names With 20 Lines of Code - SQL Server Appendix Z - Site Home - MSDN Blogs

  4. Pingback: PowerShell – Essential Posts | FICILITY.NET

  5. Mark says:

    is there any way to send request from different static IP address to a any website? My website is part of a load balance and has a bunch of different ip addresses,

    • Ron V says:

      This is what i’ve been doing to test separate nodes of a load balanced website. I haven’t been able to figure out how to authenticate the request, but it works for anonymous websites. If anyone can help me with authenticating the connection (similar to -usedefaultcredentials in the invoke-webrequest commandlet), or just a better solution in general i would greatly appreciate it.

      #—i am only posting required parts of my script to make it easier to post, no guarantee that it won’t need some minor bug fixes.—-

      $ip = "166.123.12.2"
      $Uri = new-object system.uri("http://www.google.com")
      [string]$request = "GET $($Uri.AbsolutePath)" + " HTTP/1.1" + [System.Environment]::NewLine
      $request += "Host: $($Uri.host)" + [System.Environment]::NewLine + [System.Environment]::NewLine

      [System.Net.Sockets.TcpClient]$tcpclnt = New-Object System.Net.Sockets.TcpClient($ip, $Uri.port)
      $tcpclnt.ReceiveBufferSize = 102400
      $stream = $tcpclnt.GetStream()

      $binaryMessage = [System.Text.ASCIIEncoding]::ASCII.GetBytes($request)
      $stream.Write($binaryMessage, 0, $binaryMessage.Length)
      [int]$clientLength = $tcpclnt.ReceiveBufferSize
      $receiveMessageBytes = New-Object byte[] $tcpclnt.ReceiveBufferSize
      $count = $stream.Read($receiveMessageBytes, 0, $tcpclnt.ReceiveBufferSize)
      $stream.Close()
      $tcpclnt.Close()
      [string]$receiveMessage = [System.Text.ASCIIEncoding]::ASCII.GetString($receiveMessageBytes)
      $receiveMessageBytes = $null
      $receiveMessage = $receiveMessage.Substring(0, $count)
      ——-

  6. Lokiarmos says:

    Hi, Niffty bit of code, i did find one bug though in the Get-WebPage block,

    In the process block you have the following line of code for getting the lenght, “{0:N0}” -f ($wr.DownloadString($url) | Out-String).length -as [INT]

    The Problem is that the object $wr does not exist, it should be $wc.

    otherwise a neat little bit of code.

  7. Jeff says:

    Good article. Is there any option to send a certificate for sites that require the user’s PKI certificate for authentication?

    • Boe Prox says:

      I found this out on Stack Overflow that uses some inline C# code to allow a certificate to be added to the web request. Not sure if it is exactly what you are looking for. I will do some more research to see if there are other options.

  8. Pingback: Clearing PeopleSoft Cache Using PowerShell Remoting (Part 3) | Other Duties As Required

  9. shreyas says:

    I am very new to powershell…….Can someone help me how to write a powershell script to find out a particular content in a webpage…… say i am looking for a “” tag for example.

  10. Sam says:

    I was struggling in capturing specific string value that I search from page source. Above explained code helped me in modifying my code and fetch what I want,
    Good piece of cake :)

  11. Pingback: More PowerShell V3 Goodness | Learn Powershell | Achieve More

  12. Greg says:

    Thanks for your reply, I tried that out, and I get the same error. Been doing some more reading, and apparently need to deal with the device’s Certificate

  13. Greg says:

    Hi,

    Firstly, I am new to programming and scripting.

    Will the above method work when querying network devices such as a bluecoat? if I try it, i get an error –

    PS C:\Users\Greg> $page = (new-object net.webclient).DownloadString(“https://testbc:8082/ContentFilter/TestUrl/Verbose/rotten.com”)
    Exception calling “DownloadString” with “1” argument(s): “The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel.”
    At line:1 char:50
    + $page = (new-object net.webclient).DownloadString <<<< ("https://testbc01:8082/ContentFilter/TestUrl/Verbose/rotten.com")
    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : DotNetMethodException

    If i browse to the same url using mozilla it works fine. Although I do need to accept the invalid security certificate and this device needs authentication. How can i work around this?

    • Boe Prox says:

      Try this and see what happens:

      $page = (new-object net.webclient)
      $page.UseDefaultCredentials = $True
      $Page.DownloadString(“https://testbc:8082/ContentFilter/TestUrl/Verbose/rotten.com”)

      Sometimes you have to set the UseDefaultCredentials to True for a secure connection.

  14. ehaze says:

    Thank you for starting at square one!

    I had some of this down, but was missing the UseDefaultCredentials = $True.

  15. André says:

    Very useful and written in a clear manner. Unfortunately there seems to be a fault in the get-webpage script since it returns follow fault:
    The term ‘Get-WebPage’ is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
    At line:1 char:12
    + Get-WebPage <<<< -url "http://www.bing.com"
    + CategoryInfo : ObjectNotFound: (Get-WebPage:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

    I have been trying to locate the fault but sofar no succes.

    • Boe Prox says:

      Thanks for the comment!
      When you save the code to a script, such as Get-Webpage.ps1, do you then dot source the script? By that I mean from the PowerShell console, you type . .\Get-Webpage.ps1 to load the function into memory. Once you do that, you will be able to call the Get-Webpage function.

  16. Pingback: Tweets that mention Using PowerShell to Query Web Site Information | Learn Powershell | Achieve More -- Topsy.com

  17. Matt Bramer says:

    Excellent write up. Can’t wait to try these out. Thanks for sharing!

    Cheers,
    Matt
