Script to List File Space by Date

After setting up a Network Video Recorder recently (using Milestone XProtect Essentials software), I needed a way to add up and compare how much disk space is being used for each day of recordings. Without this information, there is no easy way to see what effect various settings have on disk space (e.g. frames per second, motion sensing, etc.). And because there are often tens of thousands of files per day, Windows Explorer balks at filtering the files by date and summing their space.

Here’s a custom PowerShell script to go through all the files in a folder and sum their sizes by date. Only the first parameter, the folder to evaluate, is required. Other parameters let you specify display units (default is GB), whether to include subfolders (default is True), and filename and date filters.
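To illustrate the idea of summing sizes by date before getting to the full script, here is a minimal sketch that uses Group-Object instead of the script's hashtable. The folder `$DemoFolder` and the three sample files are hypothetical and exist only for this example; they are created in the temp folder and removed at the end.

```powershell
# Hypothetical demo folder in the temp directory (not part of the real script)
$DemoFolder = Join-Path ([IO.Path]::GetTempPath()) ("SpaceByDateDemo_" + [Guid]::NewGuid().ToString("N"))
New-Item -ItemType Directory -Path $DemoFolder | Out-Null

# Two files "recorded" on 01/26/2017 and one on 01/27/2017
$Samples = @(
    @{ Name = "a.bin"; Bytes = 100; Modified = [DateTime]"2017-01-26 08:00" },
    @{ Name = "b.bin"; Bytes = 200; Modified = [DateTime]"2017-01-26 20:00" },
    @{ Name = "c.bin"; Bytes = 400; Modified = [DateTime]"2017-01-27 09:00" }
)
foreach ($Sample in $Samples) {
    $Path = Join-Path $DemoFolder $Sample.Name
    [IO.File]::WriteAllBytes($Path, (New-Object byte[] ($Sample.Bytes)))  # file of exact size
    (Get-Item $Path).LastWriteTime = $Sample.Modified                     # back-date the file
}

# Group files by the date part of LastWriteTime, then sum Length within each group
$SummaryByDate = Get-ChildItem $DemoFolder -Recurse -Force |
    Where-Object { -not $_.PSIsContainer } |
    Group-Object { $_.LastWriteTime.ToString("yyyy/MM/dd") } |
    Sort-Object Name |
    ForEach-Object {
        New-Object PSObject -Property @{
            Date      = $_.Name
            TotalSize = ($_.Group | Measure-Object Length -Sum).Sum
            Count     = $_.Count
        }
    }

$SummaryByDate | Format-Table Date, TotalSize, Count -AutoSize

# Clean up the demo folder
Remove-Item $DemoFolder -Recurse -Force
```

For the three sample files, this should show 300 bytes in two files for the first date and 400 bytes in one file for the second. Group-Object holds every file object in memory until grouping finishes, so with hundreds of thousands of files per day, a running total like the one in the script below is probably the lighter approach.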

Sample Output

Here’s an actual evaluation of a 4TB volume set to retain 15 days of video:

FolderToEvaluate:   E:\MediaDatabase
IncludeSubfolders:  True
IncludePatterns:    *.*
MinDateTime:        01/01/0001 00:00:00
MaxDateTime:        12/31/9999 23:59:59

Date       Total Size   Count Avg. Size
---------- ----------   ----- ---------
01/26/2017    13.3 GB   1,133     12 MB
01/27/2017   117.6 GB  24,423      5 MB
01/28/2017   186.4 GB  51,046      4 MB
01/29/2017   197.6 GB  56,639      4 MB
01/30/2017   179.1 GB  49,945      4 MB
01/31/2017   246.0 GB  60,129      4 MB
02/01/2017   287.3 GB  69,614      4 MB
02/02/2017   232.5 GB  61,035      4 MB
02/03/2017   171.9 GB  48,650      4 MB
02/04/2017   179.4 GB  51,118      4 MB
02/05/2017   189.6 GB  52,592      4 MB
02/06/2017   206.1 GB  77,955      3 MB
02/07/2017   264.9 GB 108,529      2 MB
02/08/2017   296.9 GB 121,659      2 MB
02/09/2017   329.6 GB 135,037      2 MB
02/10/2017   255.0 GB 104,497      2 MB
02/11/2017   116.7 GB  47,798      2 MB

Matching files:     1,121,799
Total size:         3,725,687,629,210 bytes ( 3,469.8 GB )
Average size:       3,321,172 bytes ( 3 MB )

That’s odd…I decreased frames per second on February 7, and both daily file count and disk space increased? I’ll have to look into that.

By the way, it took about 18.5 minutes to run that script to tally 1.12 million files.

The Script

Here’s the script. Copy and paste to a file, e.g. ListFileSpaceByDate.ps1:

<#
.Synopsis
  Given a folder name, add up disk space used by files, summarized by date.
  Optionally include subfolders.
  Optionally filter by file names and modified dates.

  Copyright (c) 2017 by MCB Systems. All rights reserved.
  Free for personal or commercial use.  May not be sold.
  No warranties.  Use at your own risk.

.Notes
    Name:       MCB.ListFileSpaceByDate.ps1
    Author:     Mark Berry, MCB Systems
    Created:    02/10/2017
    Last Edit:  02/11/2017

  Changes:
  02/11/2017 Add average file size column and grand total.

.Parameter FolderToEvaluate
    The folder containing files to check.
  Default:  none

.Parameter DisplayUnits
    Specify KB, MB, or GB for display units, or empty for bytes.
    GB is displayed with one decimal; others have no decimals.
  Default:  "GB"

.Parameter IncludeSubfolders
    Whether or not to include subfolders when checking file sizes.
  Default:  $true

.Parameter IncludePatterns
    String array of pattern(s) to include when selecting files.
  Default:  "*.*"

.Parameter MinDateTime
    Files with Date Modified before this time stamp will not be included.
  Default:  none (no minimum date)

.Parameter MaxDateTime
    Files with Date Modified after this time stamp will not be included.
  Default:  none (no maximum date)

.Parameter LogFile
    Path to a log file. Required by MaxRM script player.  Not used here.
  Default:  "".
#>
param(
  [Parameter(Mandatory = $true,
                    Position = 0,
                    ValueFromPipelineByPropertyName = $true)]
    [String]$FolderToEvaluate,

  [Parameter(Mandatory = $false,
                    Position = 1,
                    ValueFromPipelineByPropertyName = $true)]
    [String]$DisplayUnits="GB",

  [Parameter(Mandatory = $false,
                    Position = 2,
                    ValueFromPipelineByPropertyName = $true)]
    [Boolean]$IncludeSubfolders=$true,

  [Parameter(Mandatory = $false,
                    Position = 3,
                    ValueFromPipelineByPropertyName = $true)]
    [String[]]$IncludePatterns="*.*",

  [Parameter(Mandatory = $false,
                    Position = 4,
                    ValueFromPipelineByPropertyName = $true)]
    [DateTime]$MinDateTime,

  [Parameter(Mandatory = $false,
                    Position = 5,
                    ValueFromPipelineByPropertyName = $true)]
    [DateTime]$MaxDateTime,

  [Parameter(Mandatory = $false,
                    Position = 6,
                    ValueFromPipelineByPropertyName = $true)]
    [String]$LogFile=""
)

$ErrFound = $false

################################################################################
# Set up parameters
################################################################################

# Set up and start stopwatch so we can print out how long it takes to run script
# http://stackoverflow.com/questions/3513650/timing-a-commands-execution-in-powershell
$StopWatch = [Diagnostics.Stopwatch]::StartNew()

# If $MinDateTime not specified, use Windows' minimum, i.e. don't filter by min date
if ([string]::IsNullOrEmpty($MinDateTime)) { $MinDateTime = [DateTime]::MinValue}
# If $MaxDateTime not specified, use Windows' maximum, i.e. don't filter by max date
if ([string]::IsNullOrEmpty($MaxDateTime)) { $MaxDateTime = [DateTime]::MaxValue}

# When user specifies max date without time, assume the user means for it to be INclusive, i.e. up to 23:59:59 on that date
if ($MaxDateTime.Hour -eq 0 -and $MaxDateTime.Minute -eq 0 -and $MaxDateTime.Second -eq 0) {
    $MaxDateTime = $MaxDateTime.AddDays(1).AddSeconds(-1) # plus 1 day minus 1 second:  23:59:59 
}

"Select files matching the following parameters and sum their sizes by Date Modified:"
""
"FolderToEvaluate:   $FolderToEvaluate"
"IncludeSubfolders:  $IncludeSubfolders"
"IncludePatterns:    $IncludePatterns"
"MinDateTime:        $MinDateTime"
"MaxDateTime:        $MaxDateTime"

################################################################################
# Process files
################################################################################

# Create a hashtable with LastWriteDate as the key (a string in yyyy/MM/dd format).
# The value is an array containing two elements:  total size, file count
$hashByDate = @{}
$TotalSize = 0
$TotalFileCount = 0

# Notes:
#   -force to include ReadOnly, Hidden, and System files. http://stackoverflow.com/a/26425580/550712.
#   { ! $_.PSIsContainer } excludes folders (just list files). http://superuser.com/a/150762/171670

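#   -include only takes effect with -recurse or when the path ends in a wildcard, so append \* when not recursing.
if ($IncludeSubfolders) { $SearchPath = $FolderToEvaluate } else { $SearchPath = Join-Path $FolderToEvaluate "*" }

Get-ChildItem $SearchPath -recurse:$IncludeSubfolders -force -include $IncludePatterns `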
    | Where-Object { ! $_.PSIsContainer -and $_.LastWriteTime -ge $MinDateTime -and $_.LastWriteTime -le $MaxDateTime} `
    | ForEach-Object { 
        $TotalFileCount++
        $TotalSize = $TotalSize + $_.Length
        [string]$LastWriteDate = $_.LastWriteTime.ToString("yyyy/MM/dd") # store string as yyyy/MM/dd for correct sorting
        if ( $hashByDate.ContainsKey($LastWriteDate) ) {
            # date already exists in hashtable:  add current file size to sum
            $FileSizeSum = $hashByDate.$LastWriteDate[0]
            $FileCount   = $hashByDate.$LastWriteDate[1]
            $FileSizeSum = $FileSizeSum + $_.Length
            $FileCount++
            $hashByDate.Set_Item($LastWriteDate, @($FileSizeSum, $FileCount))
        } else { 
            # date doesn't already exist in hashtable:  add new entry with current file size and count
            $hashByDate.Add($LastWriteDate, @($_.Length, 1) )
        }
    } # end of ForEach-Object

################################################################################
# Output results
################################################################################

# Print table sorted by date (the two columns in a hash table are always called Name and Value).
# Re-format the first column as MM/dd/yyyy by first converting to DateTime, then ToString.
# Format the second column with thousands separators.
# Format the average size column one "less" than the DisplayUnits, e.g. for GB total, show average MB.
# Seems inelegant to use a Switch for DisplayUnits but would be messy to embed logic in the same format-table.

switch ($DisplayUnits)
{
    "KB" { 
            $hashByDate.GetEnumerator() | Sort-Object Name `
                | format-table -autosize `
                    @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
                    @{Label="Total Size";Expression={"{0:N0} KB" -f ($_.Value[0]/1KB)};align="right"}, `
                    @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
                    @{Label="Avg. Size";Expression={"{0:N0}" -f ($_.Value[0]/$_.Value[1])};align="right"}

            "Matching files:     " + "{0:N0}" -f $TotalFileCount
            "Total size:         " + "{0:N0}" -f $TotalSize + " bytes ( " + "{0:N0} KB" -f ($TotalSize/1KB) + " )"
            "Average size:       " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes"
         }

    "MB" { 
            $hashByDate.GetEnumerator() | Sort-Object Name `
                | format-table -autosize `
                    @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
                    @{Label="Total Size";Expression={"{0:N0} MB" -f ($_.Value[0]/1MB)};align="right"}, `
                    @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
                    @{Label="Avg. Size";Expression={"{0:N0} KB" -f ($_.Value[0]/$_.Value[1]/1KB)};align="right"}

            "Matching files:     " + "{0:N0}" -f $TotalFileCount
            "Total size:         " + "{0:N0}" -f $TotalSize + " bytes ( " + "{0:N0} MB" -f ($TotalSize/1MB) + " )"
            "Average size:       " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes ( " + "{0:N0} KB" -f ($TotalSize/$TotalFileCount/1KB) + " )"
         }

    "GB" { 
            $hashByDate.GetEnumerator() | Sort-Object Name `
                | format-table -autosize `
                    @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
                    @{Label="Total Size";Expression={"{0:N1} GB" -f ($_.Value[0]/1GB)};align="right"}, `
                    @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
                    @{Label="Avg. Size";Expression={"{0:N0} MB" -f ($_.Value[0]/$_.Value[1]/1MB)};align="right"}

            "Matching files:     " + "{0:N0}" -f $TotalFileCount
            "Total size:         " + "{0:N0}" -f $TotalSize + " bytes ( " + "{0:N1} GB" -f ($TotalSize/1GB) + " )"
            "Average size:       " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes ( " + "{0:N0} MB" -f ($TotalSize/$TotalFileCount/1MB) + " )"
         }

    default { # includes empty value:  display exact bytes
            $hashByDate.GetEnumerator() | Sort-Object Name `
                | format-table -autosize `
                    @{Label="Date      ";Expression={([DateTime]$_.Name).ToString("MM/dd/yyyy")}}, `
                    @{Label="Total Size";Expression={"{0:N0}" -f ($_.Value[0])};align="right"}, `
                    @{Label="Count";Expression={"{0:N0}" -f $_.Value[1]};align="right"}, `
                    @{Label="Avg. Size";Expression={"{0:N0}" -f ($_.Value[0]/$_.Value[1])};align="right"}

            "Matching files:     " + "{0:N0}" -f $TotalFileCount
            "Total size:         " + "{0:N0}" -f $TotalSize + " bytes"
            "Average size:       " + "{0:N0}" -f ($TotalSize/$TotalFileCount) + " bytes"
         }

}

################################################################################
# Wrap-up
################################################################################

# Conclude with local time
"`n======================================================"
'Local Machine Time:  ' + (Get-Date -Format G)

# Stop the stopwatch and show the elapsed time
$StopWatch.Stop()
"Script execution took $($StopWatch.Elapsed)."
""

if ($ErrFound) {
  $ExitCode = 1001 # Cause script to report failure in MaxRM dashboard
}
else {
  $ExitCode = 0
}
"Exit Code: " + $ExitCode
Exit $ExitCode

Sample Execution

Here’s an example of running the script with all parameters (except the unused -LogFile) specified, limiting the date range to one month. Put this on one line:

powershell.exe -NoLogo -NoProfile -NonInteractive ".\ListFileSpaceByDate.ps1" -FolderToEvaluate "E:\MediaDatabase" -IncludeSubfolders $true -IncludePatterns "*.*" -DisplayUnits "MB" -MinDateTime "01/01/2019" -MaxDateTime "01/31/2019"
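Because -MaxDateTime in that example is a date without a time, the script extends it to the end of that day so that files modified on the last day are included. The snippet below isolates that adjustment as a sketch; it uses the same test as the script, with the example's date assigned directly rather than passed as a parameter.

```powershell
# A date-only string converts to midnight at the start of that day
[DateTime]$MaxDateTime = "01/31/2019"

# Same test the script uses: hour, minute, and second all zero means no time was given
if ($MaxDateTime.Hour -eq 0 -and $MaxDateTime.Minute -eq 0 -and $MaxDateTime.Second -eq 0) {
    $MaxDateTime = $MaxDateTime.AddDays(1).AddSeconds(-1)   # plus 1 day minus 1 second:  23:59:59
}

# Show the adjusted upper bound
"MaxDateTime:        $MaxDateTime"
```

One consequence of this test is that an explicit time of 00:00:00 cannot be told apart from no time at all, so it is extended to 23:59:59 in the same way.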

Known Issue

If no files are found, the script will abort with a divide by zero error when trying to compute average file size.
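One possible way to avoid that error is to guard the division. This is only a sketch and is not part of the script above: `Get-AverageSize` is a hypothetical helper name, and the two variables simulate a run that matched no files.

```powershell
# Hypothetical helper (not in the original script):
# returns the average size, or 0 when no files matched
function Get-AverageSize {
    param([double]$TotalSize, [long]$FileCount)
    if ($FileCount -eq 0) { return 0 }
    return $TotalSize / $FileCount
}

# Simulate a run where no files matched the filters
$TotalSize = 0
$TotalFileCount = 0

# Same output format as the script's summary line, without the divide by zero
"Average size:       " + "{0:N0}" -f (Get-AverageSize $TotalSize $TotalFileCount) + " bytes"
```

The per-date table is not affected, because a date only gets a row in the hashtable once at least one file has been counted for it.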

7 thoughts on “Script to List File Space by Date”

  1. Art Bergquist

    Thanks so much, Mark!

    BTW, you actually mean “Copy and paste”, not “Cut and paste”, in the instruction:

    “Cut and paste to a file, e.g. ListFileSpaceByDate.ps1:”

    Also

    Thanks again.

  2. Brandon Morales

    Hi,

    How to change the min date and max date range?

  3. Mark Berry Post author

    @Brandon, you can use the -MinDateTime and -MaxDateTime parameters. I’ve added an example to the end of the post above.

  4. Steve

    Hey Mark,

    I am running your script on an insanely large directory. The script just stopped running this morning, no errors or anything in the logs. Do you know if there are limitations on the size of the data set it collects?

    Thanks in advance

  5. Mark Berry Post author

    @Steve, I’m not aware of any limitations. How are you running it? If in Task Scheduler, it may be hitting the time limit of the task. Try running it interactively from a command prompt. Out of curiosity, how much data are you processing (total number of files, total TB)?

  6. Jeff

    Mark, really great, useful script. Nicely done! Thanks for sharing
