Creating a Report of Log File Data using Regular Expressions, Arrays and Hash Table

As I was checking my email early in the morning, I saw a message from a co-worker asking me to write a script to parse some logs generated by another script, which records the date each time it runs. In short, that script checks the network connectivity of some systems. If a system is up, nothing else happens; otherwise, the system name is written to the log. My task was to go through each of the log files for the past week and month, grab each hostname and find out how many times each system had shown as down. Simple enough, I thought: I can whip up a PowerShell script to go through those logs quickly and use a hash table to collect the data I need.

Here is an example of the logfile (the hostnames have been changed to protect the innocent):

Current Date is 02/25/2011 20:09:46

TEST1-568956
TEST1-568923

Current Date is 02/25/2011 20:09:46

TEST2-895645
TEST2-568956

For the sake of testing at home, I wrote a small script to create some log files to match what I needed to work with.

Code

$workstations = @('TEST1-788989','TEST1-568923','TEST1-568956','TEST1-568956')
$workstations1 = @('TEST2-568956','TEST2-23589','TEST2-895645','TEST2-56895623')
1..5 | % {
@"
Current Date is $((Get-Date).AddDays($_))

$($workstations | Get-Random)
$($workstations | Get-Random)

Current Date is $((Get-Date).AddDays($_))

$($workstations1 | Get-Random)
$($workstations1 | Get-Random)
"@ | Out-File "log$($_).txt"
}

This uses a here-string to generate the body of each log file. Now that I had my log files, the next step was to collect all of them before parsing them for the information I needed.

For my code, I decided to use a hash table since all I was looking for was the workstation and the number of times it was down based on the log files. I create an empty hash table using the following code:

$hash = @{}

I also enlist the Switch statement with the -Regex and -File parameters to parse the logs for specific values, namely the workstation names. Using -File lets you supply a filename; Switch then reads the file line by line and tests each line against the patterns you provide. This differs from the common pattern of assigning the output of the Get-Content cmdlet to a variable, which stores the entire contents of the file in memory before any parsing starts. Think about grabbing all of the contents of a 1GB file that way. That is not only going to be slow, it will also bog down your system.
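As a rough sketch of how Switch with -File walks through a file, the snippet below writes a tiny sample log to the temp folder and collects the lines that match a pattern. The file name sample.log and the variable names are made up for this illustration and are not part of the real scripts.

```powershell
# Hypothetical sample file, created only for this illustration
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.log'
@'
Current Date is 02/25/2011 20:09:46

TEST1-568956
TEST1-568923
'@ | Set-Content -Path $samplePath

# Switch -File hands each line of the file to the matching clause as $_
$matched = @()
switch -Regex -File $samplePath {
    '^TEST1-\w+' { $matched += $_ }
}

# Show the lines that matched, then clean up the sample file
$matched
Remove-Item -Path $samplePath
```

Only the two workstation lines end up in $matched; the date line and the blank line match no clause and are skipped.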

Using -Regex allows me to perform a regular expression match against each line of the log file. I did write a post on regular expressions here. Since I know that the workstations follow a specific naming convention of “TEST1-” and “TEST2-” followed by a number, I can figure out the regular expression that I need for each switch clause to locate the workstations. I can use “^TEST1-\w+” and “^TEST2-\w+”.
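A quick way to sanity-check patterns like these before putting them in a Switch statement is the -match operator. This is only a sketch: the sample lines are modeled on the log excerpt above, where the names appear as TEST1-... and TEST2-..., and the variable names are my own.

```powershell
# Sample lines modeled on the log excerpt (illustrative values only)
$lines = 'Current Date is 02/25/2011 20:09:46', '', 'TEST1-568956', 'TEST2-895645'

# -match is case-insensitive by default, just like Switch -Regex
$results = foreach ($line in $lines) {
    if ($line -match '^TEST1-\w+') { "TEST1 system: $line" }
    elseif ($line -match '^TEST2-\w+') { "TEST2 system: $line" }
}
$results
```

The date line and the empty line produce nothing, so $results holds one entry per workstation line.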

Once I find a workstation, I check whether the workstation name already exists in my hash table. If it doesn’t, I add the workstation along with a counter of 1 to show that this is the first instance of that system being down. Otherwise, the counter for that workstation is incremented by 1. Inside the Switch statement, $_ holds the line that matched the regular expression, which here is the workstation name.

If ($hash.ContainsKey($_)) {
	$hash[$_]++
	}
Else {
	$hash.Add($_,1)
	}
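To see the counting logic on its own, outside of the Switch statement, here is a small self-contained sketch. The $down list is made-up sample data standing in for the lines read from the logs.

```powershell
# Made-up list of down systems, standing in for lines read from the logs
$down = 'TEST1-568956', 'TEST2-895645', 'TEST1-568956'

$hash = @{}
foreach ($name in $down) {
    if ($hash.ContainsKey($name)) {
        # Seen before: bump the counter
        $hash[$name]++
    }
    else {
        # First sighting: start the counter at 1
        $hash.Add($name, 1)
    }
}

$hash['TEST1-568956']   # 2
$hash['TEST2-895645']   # 1
```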

When it was all said and done, I also had to send an email to some customers listing the contents of the hash table. The problem with a hash table is that if you write it out as a string, whether with Write-Host or in the body of an email, it will not come out quite like you expect it to. Try any of the following:

Write-Host $hash
Write-Host $hash.GetEnumerator()
Write-Host $($hash)
Write-Host $($hash.GetEnumerator())


All that comes out is a list of type names. Not exactly what I call reportable material. Using the following code gives me the format that I am looking for.

Write-Host $($hash.GetEnumerator() | Out-String)

I wrap the command in $() so the expression is evaluated before Write-Host is called, and the Out-String cmdlet converts the output of the GetEnumerator() method (which is used to get all of the entries) into a single formatted string.
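The same idea works for building the text of an email body. The following is a sketch with made-up sample counts; the column labels mirror the ones used in the full script, and $body is just an illustrative variable name.

```powershell
# Made-up sample data for illustration
$hash = @{ 'TEST1-568956' = 5; 'TEST2-895645' = 3 }

# Rename the columns, then flatten the formatted table into one string
$body = $hash.GetEnumerator() |
    Sort-Object Name |
    Select-Object @{Label = 'Workstation'; Expression = { $_.Name }},
                  @{Label = 'Days_Down'; Expression = { $_.Value }} |
    Out-String

$body
```

Because $body is a plain string, it can be handed to Write-Host or to the -Body parameter of Send-MailMessage without turning into type names.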


Code

<#  
.SYNOPSIS  
    Parses the client status log files to report down statuses for a month
.DESCRIPTION
    Parses the client status log files to report down statuses for a month. 
    Email is then sent to users listing report
.NOTES  
    Name: Status_Report.PS1
    Author: Boe Prox
    DateCreated: 24Feb2011        

#> 
Begin {
    #Create header dates
    $begin = ((Get-Date).AddMonths(-1)).Toshortdatestring()
    $today = (Get-Date).Toshortdatestring()
    
    #Create empty hash table
    $hash = @{}
    
    #Gather logs
    $logs = GCI -Filter *.txt
    }
Process {
    #Iterate through each log
    ForEach ($log in $logs) {
        Write-Verbose "$($log.FullName)"
        Switch -regex -file $log.FullName {
            "^Current\sDate\sis\s" {
                Write-Verbose "Date Run: $($_)"
                }
            "^TEST1-\w+" {
                If ($hash.ContainsKey($_)) {
                    #Increment value
                    $hash[$_]++
                    }
                Else {
                    #Add to hash table
                    $hash.Add($_,1)
                    }               
                }
            "^TEST2-\w+" {
                If ($hash.ContainsKey($_)) {
                    #Increment value
                    $hash[$_]++
                    }
                Else {
                    #Add to hash table
                    $hash.Add($_,1)
                    }            
                } 
            Default {
                Write-Verbose "Blank Space"
                }       
            }
        }
    }
End {
    #Write header
    Write-Host "Down systems from $begin to $today"
    Write-Output $hash
    
    Send-MailMessage -to group@email.com `
        -From group@email.com `
        -smtp server.server.com `
        -Subject "Status from $begin to $today" `
        -Body "Down systems from $begin to $today `r
$(($hash.GetEnumerator() | Select @{Label = 'Workstation';Expression = {$_.Name}}, @{Label = 'Days_Down';Expression = {$_.Value}}) | out-string)"
    }

Running the script presented the following output and also sent an email to the users requesting this information:

Down systems from 1/26/2011 to 2/26/2011

Name                           Value
----                           -----
TEST2-56895623                 2
TEST1-568923                   4
TEST1-568956                   5
TEST2-23589                    2
TEST1-788989                   1
TEST2-895645                   3
TEST2-568956                   3
  

Content that everything was good after sending the finished product, I went on to work on other items. Fast-forward an hour or so and I am forwarded an email asking if the actual dates can also be included in the report. I’m thinking to myself that it should be doable; I just have to rethink how to accommodate that request. Crazy thoughts, such as trying to merge hash tables or combining a hash table with an array, began to dance in my head. So before continuing on this downward spiral, I decided to take a break and return to this later on. A few hours later, I finally came up with my solution: collect the data in a hash table and then build the report as an array.

 

Code

Begin {
    $logs = GCI -Filter *.txt
    $hash = @{}
    $report = @()
    [regex]$regex = "\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}"
    }
Process {
    ForEach ($log in $logs) {
        Switch -regex -file $log.fullname {
            "\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}" {
                $date = $regex.matches($_) | Select -ExpandProperty Value
                }
            "^TEST1-\w+" {
                If ($hash.contains($_)) {
                    $hash[$_] += $date
                    }
                Else {
                    $hash.Add($_,@($date))
                    }
                }
            "^TEST2-\w+" {
                If ($hash.contains($_)) {
                    $hash[$_] += $date
                    }
                Else {
                    $hash.Add($_,@($date))
                    }           
                }
            }
        }
    ForEach ($key in $hash.keys) {
        $temp = "" | Select Workstation, DaysDown, DatesDown
        $temp.Workstation = $key
        $temp.DatesDown = $hash[$key]
        $temp.DaysDown = $hash[$key].count  
        $report += $temp  
        }
    }
End {
    Write-Output $report
    }

The output for the script is shown below:

[Screenshot of the report output]

As you can see, there are a lot of similarities between this and my original code. I am using the same type of parsing and still using a hash table keyed on the workstation name, but this time I also parse the date, as that is now needed for the report. To pull the date out of each line, I use the regular expression “\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}”, which matches a timestamp in the following format: 01/12/2011 12:12:23. Instead of incrementing a counter for the number of days down, I build an array of the dates on which the workstation was shown as being offline.
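Here is a small sketch of that date extraction on a single line. The sample line comes from the log excerpt; the ParseExact format string is my own assumption based on that sample (it expects a two-digit hour), and the variable names are illustrative.

```powershell
[regex]$regex = '\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}'
$line = 'Current Date is 02/25/2011 20:09:46'
$invariant = [System.Globalization.CultureInfo]::InvariantCulture

$match = $regex.Match($line)
if ($match.Success) {
    # The matched text is just the timestamp portion of the line
    $stamp = $match.Value

    # Assumed format: month/day/year with a 24-hour, two-digit hour
    $date = [datetime]::ParseExact($stamp, 'MM/dd/yyyy HH:mm:ss', $invariant)
    $date.ToString('yyyy-MM-dd', $invariant)
}
```

Parsing the text into a [datetime] makes it easy to reformat the value or drop the time portion later on.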

I wanted to make sure that each value in the hash table was an array of dates, so I wrap the date in an array when I first add the workstation to the hash table:

$hash.Add($_,@($date))

The $_ represents the workstation name and $date represents the date that was parsed from the log file. Now that the value has been created as an array within the hash table, the next time that I come across the same workstation, the new date is added to that array using the following line of code:

$hash[$_] += $date

Again, $_ represents the workstation name matched by the regular expression and $date holds the date. I reference the hash table entry by its key, the workstation name, and add the date to its array using the += operator. Had I not created the value as an array, += would have concatenated the new date onto the existing string instead of adding it to a collection.
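The difference is easy to see side by side. This is a sketch with made-up dates; $asArray and $asString are illustrative names only.

```powershell
# Value created as an array: += adds a new element
$asArray = @{}
$asArray.Add('TEST1-568956', @('02/25/2011'))
$asArray['TEST1-568956'] += '02/26/2011'
$asArray['TEST1-568956'].Count    # 2

# Value created as a plain string: += concatenates the text
$asString = @{}
$asString.Add('TEST1-568956', '02/25/2011')
$asString['TEST1-568956'] += '02/26/2011'
$asString['TEST1-568956']         # 02/25/201102/26/2011
```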

 

Since this was a report being emailed out to some users, I needed to write this information to a file and send it out as an attachment. I decided to create a tab-delimited file because Export-Csv does not handle this data correctly: everything comes out fine with the exception of the DatesDown column, which shows only the type name of the array (System.Object[]) instead of the dates. Instead, I looped through the hash table and wrote the data to a tab-delimited file. The final code I used for that is below:
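To show what goes wrong with the CSV cmdlets, here is a sketch using ConvertTo-Csv, which formats values the same way Export-Csv does but writes to the pipeline instead of a file. The sample row is made up, and the [pscustomobject] syntax assumes PowerShell 3.0 or later. Joining the dates into one string first is an alternative to the tab-delimited approach I ended up using, not what the final script does.

```powershell
# Made-up report row with an array in the DatesDown property
$row = [pscustomobject]@{
    Workstation = 'TEST1-568956'
    DaysDown    = 2
    DatesDown   = @('2/25/2011', '2/26/2011')
}

# The array is written as its type name rather than its contents
$raw = $row | ConvertTo-Csv -NoTypeInformation
$raw

# Joining the dates into a single string first keeps them readable
$flat = $row | Select-Object Workstation, DaysDown,
    @{Name = 'DatesDown'; Expression = { $_.DatesDown -join '; ' }}
$joined = $flat | ConvertTo-Csv -NoTypeInformation
$joined
```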

Code

<#  
.SYNOPSIS  
    Parses the  client status log files to report down statuses for a month
.DESCRIPTION
    Parses the  client status log files to report down statuses for a month. 
    Email is then sent to users listing report
.NOTES  
    Name: ClientStatus_Report.PS1
    Author: Boe Prox
    DateCreated: 24Feb2011        

#> 
Begin {
    #Create header dates
    Write-Verbose "Creating date strings"
    $begin = ((Get-Date).AddMonths(-1)).Toshortdatestring()
    $today = (Get-Date).Toshortdatestring()
    
    #Create empty hash table
    Write-Verbose "Creating hash table"
    $hash = @{}
    $report = @()
    
    #Create regular expression to parse dates
    [regex]$regex = "\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}"
    
    #Create header for tab-delimited file
    $logfile = 'client_status.csv'
    "Workstation`tDaysDown`tDatesDown" | Out-File $logfile
    
    #Gather logs
    Write-Verbose "Gathering log files"
    $logs = GCI *.txt
    }
Process {
    #Iterate through each log
    ForEach ($log in $logs) {
        Write-Verbose "$($log.FullName)"
        Switch -regex -file $log.FullName {
            "\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}" {
                Write-Verbose "Date Run: $($_)"
                
                #Parse date for log file and report, keeping only the date portion
                [datetime]$parsed = $regex.Match($_).Value
                $date = $parsed.ToShortDateString()
                }
            "^Test1-\w+" {
                If ($hash.ContainsKey($_)) {
                    Write-Verbose "$($_) already in table"
                    #Increment value
                    $hash[$_] += $date
                    }
                Else {
                    Write-Verbose "Adding $($_) to table"
                    #Add to hash table
                    $hash.Add($_,@($date))
                    }               
                }
            "^Test2-\w+" {
                If ($hash.ContainsKey($_)) {
                    Write-Verbose "$($_) already in table"
                    #Increment value
                    $hash[$_] += $date
                    }
                Else {
                    Write-Verbose "Adding $($_) to table"
                    #Add to hash table
                    $hash.Add($_,@($date))
                    }            
                } 
            Default {
                Write-Verbose "Blank Space"
                }       
            }
        }
        #Begin processing the values in the hash table to convert into Array
        ForEach ($key in $hash.keys | Sort) {
            Write-Verbose "Formatting dates prior to adding into CSV"
            #Format the dates to make easier to read
            If ($hash[$key] -is [array]) {
                $value = "$([string]::Join('; ',$hash[$key]))"
                }
            Else {
                $value = "$($hash[$key])"
                }
            Write-Verbose "Adding to log file"
            #Column order matches the header: Workstation, DaysDown, DatesDown
            "$($key)`t$($hash[$key].count)`t$($value);" | Out-File -Append $logfile
            $temp = "" | Select Workstation, DaysDown, DatesDown
            $temp.Workstation = $key
            $temp.DatesDown = $hash[$key]
            $temp.DaysDown = $hash[$key].count  
            $report += $temp  
            }
    }
End {
    Write-Verbose "Displaying Report"
    Write-Host " Client Status from $begin to $today"
    Write-Output $Report | Sort Workstation | FT -auto
    Write-Verbose "Sending Email notification" 
    <# 
    Send-MailMessage -to email@email.com `
        -From email@email.com `
        -smtp '<server>' `
        -Subject " Client Status from $begin to $today" `
        -Attachments $logfile `
        -BodyAsHTML "<html> Please review the attached file listing down systems from <b> $begin </b> to <b> $today </b> </html>"
    #>
    #Remove Log file
    Write-Verbose "Removing log file"
    #Remove-Item $logfile -Force
    }


I decided to append a “;” after each date, including the cases where there was only one date, so that when a user opened the file, all of the dates would be aligned to the left instead of some being all the way to the right.

Using this code, I can then email the tab-delimited file as an attachment to the customers. This same type of process can easily be used to create a report from your Windows Firewall logs, Windows Update logs or any other log that has data in it that you want to report on. The possibilities are pretty much endless!

This just shows again what PowerShell is able to do to save a lot of manual effort going through logs to generate a report!

About Boe Prox

Microsoft Cloud and Datacenter MVP working as a SQL DBA.

5 Responses to Creating a Report of Log File Data using Regular Expressions, Arrays and Hash Table

  1. mjolinor says:

    I might just do that 🙂

  2. mjolinor says:

    Nice. A couple of notes:
    1. It isn’t really necessary to check to see if the key already exists. The normal behaviour when adding an element to a hash table is that if it doesn’t already exist it will be created. If it already exists it will get the new element value added to the existing value according to the rules for op_addition for the element’s type.

    When you’re creating an array as a value, you can just cast all the added elements as arrays, and array addition will automatically take care of keeping it all one array.

    This will get the files in the c:\windows directory, creating two hash tables with keys of the file extensions, and use one to count the number of files with that extension and another to create an array of the file names that have that extension, without ever checking to see if the key exists first:

    $ht1 = @{}
    $ht2 = @{}
    gci c:\windows|% {
    $ht1[$_.extension] ++
    $ht2[$_.extension] += @($_.name)
    }

    $ht1
    $ht2

    • Boe Prox says:

      Very cool! This was pretty much my first actual work with Hash Tables, so I figured that there would be a better way to do it. Thanks for dropping by and giving a good explanation and nice example!

      • mjolinor says:

        Because of the way op_addition works for hash tables, the one time you do have to check if a key already exists and explicitly create a new one if it doesn’t is if the value of the new key is going to be another hash table.
