USAGE for v5.13.3:
p4verify.sh [<instance>] [-N] [-nu] [-nr] [-ns] [-nS] [-a] [-nt] [-nz] [-o BAD|MISSING] [-chunks <ChunkSize>|-paths <paths_file>] [-w <Wait>] [-q <MaxActivePullQueueSize>] [-Q <MaxTotalPullQueueSize>] [-recent] [-dlf <depot_list_file>] [-I|-ignores <regex_ignores_file>] [-Ocache] [-n] [-L <log>] [-v] [-d] [-D]
or
p4verify.sh -h|-man
DESCRIPTION:
This script performs a 'p4 verify' of all submitted and shelved versioned
files in depots of all types except 'remote' and 'archive' type depots.
If run on a replica, it schedules archive failures for transfer to the
replica.
OPTIONS:
<instance>
Specify the SDP instance. If not specified, the SDP_INSTANCE
environment variable is used instead. If the instance is not
defined by a parameter and SDP_INSTANCE is not defined, p4verify.sh
exits immediately with an error message.
-N Specify '-N' (Notify Only On Failure) to disable the default behavior,
which is to always send a notification that includes a report of the p4
verify status. Specifying '-N' changes the behavior to send a
notification only if there is an error during the p4 verify execution.
Notification methods are email, AWS SNS, and PagerDuty. Details on
configuration can be found in the SDP documentation. Providing the
environment variable NOTIFY_ONLY_ON_FAILURE=1 is equivalent to the '-N'
command line argument.
-nu Specify '-nu' (No Unload) to skip verification of the singleton depot
of type 'unload' (if created). The 'unload' depot is verified
by default.
-nr Specify '-nr' (No Regular) to skip verification of regular submitted
archive files. The '-nr' option is not compatible with '-recent'.
Regular submitted archive files are verified by default.
-ns Specify '-ns' (No Spec Depot) to skip verification of the singleton depot
of type 'spec' (if created). The 'spec' depot is verified by default.
-nS Specify '-nS' (No Shelves) to skip verification of shelved archive
files, i.e. to skip the 'p4 verify -qS'.
-a Specify '-a' (Archive Depots) to do verification of depots of type
'archive'. Depots of type 'archive' are not verified by default, as
archive depots are often physically removed from the server's
storage subsystem for long-term cold storage.
-nt Specify the '-nt' option to avoid passing the '-t' flag to 'p4 verify'
on a replica. By default, p4verify.sh detects if it is running on a
replica, and if so automatically applies the '-t' flag to 'p4 verify'.
That causes the replica to attempt to self-heal, as files that fail
verification are scheduled for transfer from the P4TARGET server. This
default behavior results in 'Transfer scheduled' messages in the log,
and MISSING/BAD files are listed as 'info:' rather than 'error:'. There
is no clear indication of whether or which of the scheduled transfers
complete successfully, and so there may be a mix of transient/correctable
and "real"/persistent transfer errors for files that are also BAD/MISSING
on the master server. Specify '-nt' to ensure the log contains a list
of files that currently fail a 'p4 verify' without attempting to transfer
them from the master.
-nz Specify '-nz' to skip the gzip of the old log file. By default, if a
log with the default name or the name specified with '-L' exists at the
start of processing, the old log is rotated and gzipped. With this option
the old log is not zipped when rotated.
-o BAD|MISSING
Specify '-o MISSING' to check only whether expected archive files exist
or not, skipping the checksum calculation of existing files. This results
in dramatically faster, if less comprehensive, verification. This
is particularly well suited when verification is being used to schedule
archive file transfers of missing files on replicas. This translates into
passing the '--only MISSING' option to 'p4 verify'.
Specify '-o BAD' to check only for BAD revisions. This translates into
passing the '--only BAD' option to 'p4 verify'.
This option requires p4d to be 2021.1 or newer. For older p4d versions,
this option is silently ignored.
-chunks <ChunkSize>
Specify the maximum amount of content by size to verify at once. If
this is specified, the depot_verify_chunks.py script is used to
break up depots into chunks of a given size, e.g. 100M or 4G.
The <ChunkSize> parameter must be a size value valid to pass to the
depot_verify_chunks.py script with the '-m' option. That is,
specifying '-chunks 200M' translates to calling depot_verify_chunks.py
with '-m 200M'.
This requires the perforce-p4python3 module to be installed and the
python3 in the PATH must be the correct one that uses the P4 module.
Using '-chunks' is likely to result in a significantly slower overall
verify operation, though chunking can make it less impactful when it
runs. Using the '-chunks' option may be necessary on very large data
sets, e.g. if there are insufficient resources to process the largest
depots.
The '-recent' and '-chunks' options are mutually exclusive.
The '-chunks' and '-paths' options can be used together; see the
description of the '-paths' option below.
Chunking logic applies only in depots of type 'stream' or 'local'.
-paths <paths_file>
Specify a file containing a list of depot paths to verify, with one
line per entry. Valid entries in the file start with '//', e.g.
//mydepot/main/src/...
In this example, when //mydepot depot is processed, only specified
paths will be verified. All other depots will be processed in full.
To verify only specified paths, combine '-paths <paths_file>' with
'-dlf <depot_list_file>' where the depot list file contains only
'mydepot' (per the example above).
The '-recent' and '-paths' options are mutually exclusive.
The '-chunks' and '-paths' options can be used together for combined
effects. If both options are specified, depots that contain specified
paths are chunked based on the specified paths rather than the entire
depot, and other paths in that depot are not processed. Depots that
do not have any specified paths listed in the <paths_file> are
chunked at the top/depot level directory.
Paths specified must be in depots of type 'stream' or 'local'.
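As a minimal sketch of combining '-paths' with '-dlf' to verify only
selected paths, the snippet below writes a paths file and a matching depot
list file, then checks that every paths entry starts with '//'. The file
names, the temporary directory, and the second depot path are illustrative
assumptions, not SDP conventions. The p4verify.sh call is shown only as a
comment.

```shell
# Illustrative working directory; any writable location would do.
work_dir=$(mktemp -d)
paths_file="$work_dir/verify_paths.txt"
depot_list_file="$work_dir/verify_depots.txt"

# One depot path per line; each entry must start with '//'.
cat > "$paths_file" <<'EOF'
//mydepot/main/src/...
//mydepot/main/docs/...
EOF

# Depot names only: no leading '//' and no trailing '/...'.
echo "mydepot" > "$depot_list_file"

# Count entries that do not start with '//' (none are expected).
invalid_count=$(grep -vc '^//' "$paths_file" || true)
echo "invalid path entries: $invalid_count"

# The verification itself would then be run as, for example:
#   p4verify.sh 1 -paths "$paths_file" -dlf "$depot_list_file"
```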
-w <Wait>
Specify the '-w' option, where <Wait> is a positive integer
indicating the number of seconds to sleep between individual calls
to 'p4 verify' commands. For example, specifying '-w 300' results
in a delay of 5 minutes between verify commands.
This can be used with '-chunks' to inject a delay between chunked
depot paths. Otherwise, the delay is injected between each depot
processed. This can significantly lengthen the overall duration
of 'p4verify.sh' operation, but can also spread out the resource
consumption load on a server machine.
If shelves are processed (regardless of whether '-chunks' is used),
the delay is injected between each individual shelved changelist, as
shelved changes are verified one changelist at a time. For data sets
with a large number of shelves, it may be wise to process shelves
separately from submitted files if '-w' is used, as the delay value
appropriate between depots may differ from that appropriate between
individual changelists.
See the '-q' option for a description of how '-q' and '-w' can be
used together.
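To get a feel for how much '-w' can lengthen a run, the following sketch
estimates the total sleep time. It assumes one sleep between each pair of
consecutive 'p4 verify' calls (N calls, N-1 sleeps), which is an
approximation rather than a statement of the script's exact behavior. The
function name and the depot and shelf counts are hypothetical.

```shell
# Estimate the total sleep time added by '-w', assuming one sleep between
# each pair of consecutive 'p4 verify' calls (N calls give N-1 sleeps).
added_delay_seconds() {
    calls=$1
    wait_seconds=$2
    if [ "$calls" -le 1 ]; then
        echo 0
    else
        echo $(( (calls - 1) * wait_seconds ))
    fi
}

# Hypothetical data set: 12 depots and 5000 shelved changelists, '-w 300'.
echo "depots:  $(added_delay_seconds 12 300) seconds"
echo "shelves: $(added_delay_seconds 5000 300) seconds"
```

With these made-up numbers, the depot delays add under an hour, while the
same '-w 300' applied per shelved changelist adds over 17 days. This is
why a separate run with a smaller delay may suit shelves better.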
-q <MaxActivePullQueueSize>
Specify the '-q' option, where <MaxActivePullQueueSize> is a positive
integer indicating the maximum number of active pulls allowed before
a 'p4 verify' command will be executed to transfer archives.
The absolute maximum number of possible active pulls is affected by
the number of 'startup.N' threads configured to pull archive files,
and whether those threads indicate batching.
The threads that pull archive files are those that are configured to use
the 'pull' command with the '-u' option. Typically, a small number of pull
threads are configured, between 2 and 10 or perhaps 20.
If '-q 1' is specified, new 'p4 verify' commands will only be run
when the active pull queue is quiet. Specifying a too-high value,
e.g. '-q 50' if only 3 'pull -u' archive pull threads are configured,
will be ineffective, as the active pull threads will never exceed
3 (let alone 50).
The current active pull queue on a replica is reported by:
p4 -ztag -F %replicaTransfersActive% pull -ls
This option can be useful if using this p4verify.sh script to pull
many or even all archives on a new replica server machine from its
target server. The injected delays can give the server time to transfer
archives scheduled in one call to 'p4 verify' before calling it again
with the goal of avoiding overloading the pull queue.
If '-w' and '-q' options are both used, the delay specified by '-w'
is ignored unless the active pull queue size is greater than or equal
to the specified maximum active pull queue size. The '-w' then
essentially determines how frequently the 'p4 pull -ls' is run to
check the active pull queue size. A reasonable set of values might
be '-q 1 -w 10'.
The '-q' option is mutually exclusive with '-nt'.
The '-q' option is mutually exclusive with '-Q'.
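The interplay of '-q' and '-w' can be pictured as a polling loop. The
sketch below is not the code used by p4verify.sh; it is a hypothetical
illustration in which the function names are assumptions. It waits while
the active pull queue size is greater than or equal to the maximum,
sleeping between checks, and proceeds once the size drops below it.

```shell
# Report the active pull queue size, using the command documented above.
# Redefine this function to try the loop without a replica.
pull_queue_size() {
    p4 -ztag -F %replicaTransfersActive% pull -ls
}

# Return once the queue size is below max_size, sleeping wait_seconds
# between checks. Returns 1 if the size cannot be read as a number.
wait_for_pull_queue() {
    max_size=$1
    wait_seconds=$2
    while :; do
        size=$(pull_queue_size) || return 1
        case "$size" in
            ''|*[!0-9]*) return 1 ;;
        esac
        [ "$size" -lt "$max_size" ] && return 0
        sleep "$wait_seconds"
    done
}

# Usage, mirroring '-q 1 -w 10':
#   wait_for_pull_queue 1 10 && echo "pull queue is quiet"
```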
-Q <MaxTotalPullQueueSize>
Specify the '-Q' option, where <MaxTotalPullQueueSize> is a positive
integer indicating the maximum number of total pulls allowed before
a 'p4 verify' command will be executed to transfer archives.
In certain scenarios, the pull queue can become quite massive. For
example, if a fresh standby replica is seeded from a checkpoint
but has no archive files, and then a 'p4verify.sh' is run, the
verify will schedule all files to be transferred, perhaps millions.
If the pull queue gets too large, it can impact metadata replication.
Setting this value may help mitigate issues related to scheduling
too many archive pulls at once, by delaying scheduling new archive
pulls until enough previously scheduled pulls are completed.
This option can be useful in such scenarios, if this p4verify.sh script
is used to pull many or even all archives on a new replica server machine
from its target server. The injected delays can give the server time to
transfer archives scheduled in one call to 'p4 verify' before calling it
again with the goal of avoiding overloading the pull queue.
If individual depots contain large numbers of files, such that
a verify on a single depot will schedule too many files to be
transferred at once, it may be necessary to combine this option with
the '-chunks' option to avoid overloading the transfer queue.
**WARNING**: If there are files that cannot be transferred from the
replica's target server, the value of '-Q' must be set to higher than
that number, or an infinite loop may occur. For example, if there are
500 permanent "legacy" verify errors on the commit server from 10
years ago that have long since been abandoned, those files can never
be transferred to any replica. Running p4verify.sh on the replica will
cause those files to be scheduled, but as they cannot be pulled, they
will land in the total pull queue. In this scenario, the value set
with '-Q' must be greater than 500, or an infinite loop is possible.
Specify '-Q 0' to disable checking the total pull queue.
The current total pull queue on a replica is reported by:
p4 -ztag -F %replicaTransfersTotal% pull -ls
If '-w' and '-Q' options are both used, the delay specified by '-w'
is ignored unless the total pull queue size is greater than or equal
to the specified maximum total pull queue size. The '-w' then
essentially determines how frequently the 'p4 pull -ls' is run to
check the total pull queue size. A reasonable set of values might
be '-Q 50000 -w 10'.
The '-Q' option is mutually exclusive with '-nt'.
The '-Q' option is mutually exclusive with '-q'.
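The WARNING above comes down to a simple comparison: the '-Q' value must
exceed the number of files that can never be transferred. As a rough
sketch, the snippet below counts error lines in a verify log and checks a
proposed '-Q' value against that count. It assumes each error line ends
with 'BAD!' or 'MISSING!'. The function names and the sample log are made
up for illustration.

```shell
# Count verify errors in a log, assuming each error line ends with
# 'BAD!' or 'MISSING!'.
count_verify_errors() {
    grep -Ec '(BAD|MISSING)!$' "$1" || true
}

# Succeed only if the proposed '-Q' value exceeds the error count.
# A value of 0 disables the total pull queue check, so it always passes.
q_value_is_safe() {
    q_value=$1
    error_count=$2
    [ "$q_value" -eq 0 ] || [ "$q_value" -gt "$error_count" ]
}

# Made-up sample standing in for a verify log from the target server.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
//Alpha/main/a.c#1 - add change 10 (text) MISSING!
//Alpha/main/b.c#2 - edit change 11 (text) 0123456789ABCDEF0123456789ABCDEF BAD!
//Alpha/main/c.c#1 - add change 12 (text) 0123456789ABCDEF0123456789ABCDEF
EOF

errors=$(count_verify_errors "$sample_log")
if q_value_is_safe 500 "$errors"; then
    echo "-Q 500 exceeds $errors known errors"
fi
```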
-recent
Specify that only recent changelists should be verified.
The $SDP_RECENT_CHANGES_TO_VERIFY variable defines how many
changelists are considered recent; if it is unset, the default is 200.
If the default is not appropriate for your site, add
"export SDP_RECENT_CHANGES_TO_VERIFY=<value>" to /p4/common/config/p4_N.vars to
change the default for an instance, or to /p4/common/bin/p4_vars to
change it globally.
When -recent is used, neither shelves nor files in the unload depot
are verified.
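For illustration, a site that wants a larger window might add a line like
the first one below to one of the files named above. The value 500 is an
arbitrary example. The second assignment is only a sketch of how an unset
variable falls back to the documented default, and the 'recent_count' name
is hypothetical.

```shell
# Hypothetical site value: treat the latest 500 changelists as recent.
export SDP_RECENT_CHANGES_TO_VERIFY=500

# Fallback idiom matching the documented default of 200 when unset.
recent_count=${SDP_RECENT_CHANGES_TO_VERIFY:-200}
echo "recent changelists to verify: $recent_count"
```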
-dlf <depot_list_file>
Specify a file containing a list of depots to process in the desired
order. By default, all depots reported by 'p4 depots' are processed, which
effectively results in depots being processed in alphabetical order.
This can be useful in time-sensitive situations where the order
of processing can be prioritized, and/or to prevent processing
certain depots.
The format of the depot list file is straightforward: one line per
depot, without the leading '//' or trailing '/...', so a list might look
like this sample:
ProjA
ProjB
spec
.swarm
unload
archive
ProjC
Blank lines and lines starting with a '#' are treated as comments and
ignored.
WARNING: This is not intended to be the primary method of verification,
because it would be easy to forget to add new depots to the list file.
If the depot list file is not readable, processing aborts.
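One way to reduce the risk named in the WARNING is to check periodically
which depots are absent from the list file. The sketch below compares two
plain files of depot names. The function name and sample data are
hypothetical, and the full depot list is assumed to have been captured
separately (for example from 'p4 -ztag -F %name% depots', itself an
assumption about the output field name).

```shell
# Print depots present in all_depots_file but absent from depot_list_file.
# Blank lines and '#' comment lines in the depot list file are skipped.
missing_from_depot_list() {
    all_depots_file=$1
    depot_list_file=$2
    wanted=$(mktemp)
    grep -Ev '^(#|[[:space:]]*$)' "$depot_list_file" > "$wanted" || true
    grep -Fvx -f "$wanted" "$all_depots_file" || true
    rm -f "$wanted"
}

# Made-up sample data.
all_depots=$(mktemp)
depot_list=$(mktemp)
printf '%s\n' ProjA ProjB ProjC spec > "$all_depots"
printf '%s\n' '# priority order' ProjA '' spec > "$depot_list"

missing_from_depot_list "$all_depots" "$depot_list"
```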
-ignores <regex_ignores_file>
Specify the 'verify ignores' file, a file containing a series of
regular expression patterns representing files or file revisions
to ignore when scanning for verify errors. Errors matching the
pattern will be suppressed from the output captured in the log,
and will not be considered a verification error.
If '-ignores' is not specified, the default verify ignores
file is:
/p4/common/config/p4verify.N.ignores
where 'N' is the SDP instance name. If this file exists, it is
considered the 'verify ignores' file.
Specify '-ignores none' to avoid processing the standard ignores
file.
The patterns can be specific files, specific file paths, or broader
patterns (e.g. in the case of entirely abandoned depots). The file
provided is passed as the '-f <file>' option to the 'grep' utility,
and is expected to contain a series of one-line entries, each
containing an expression to exclude from being considered as verify
errors reported by this script.
You can test your expression by first using it with grep to
ensure it suppresses errors by using a command like this sample,
providing an older log from this script that contains errors to
be suppressed:
grep -Ev -f /path/to/regex_file /path/to/old/p4verify.log
If your server is case-insensitive, change that command to use '-i':
grep -Evi -f /path/to/regex_file /path/to/old/p4verify.log
This sample entry ignores a single file revision:
//Alpha/main/docs/Expenses from February 1999.xls#3
This sample entry ignores all revisions of a single file:
//Alpha/main/docs/Expenses from February 1999.xls
This sample entry ignores all entries in the spec depot
related to client specs:
//spec/client
This sample uses the MD5 checksum from the verify error, just
to illustrate that this can be used as an alternative to
specifying file paths:
D34989BFB8D9B0FB9866C4A604A05410
This sample ignores BAD! (but not MISSING!) errors under the
//Beta/main/src directory tree:
//Beta/main/src/.* BAD!
WARNING: Ensure that the regex file provided does NOT contain
any blank lines or comments. The file should contain only tested
regex patterns.
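The checks above can be combined into a small dry run before a regex file
is put into service. In this sketch the file locations and log lines are
made up, and the log line layout is only an assumption. It first confirms
the regex file has no blank or comment lines, then lists the errors that
would still be reported.

```shell
work_dir=$(mktemp -d)
regex_file="$work_dir/p4verify.ignores"
old_log="$work_dir/p4verify.log"

# Two of the sample patterns from above.
cat > "$regex_file" <<'EOF'
//Beta/main/src/.* BAD!
//spec/client
EOF

# Made-up error lines standing in for an older p4verify.sh log.
cat > "$old_log" <<'EOF'
//Beta/main/src/foo.c#2 - edit change 101 (text) 0123456789ABCDEF0123456789ABCDEF BAD!
//Beta/main/src/bar.c#1 - add change 99 (text) MISSING!
//Alpha/main/docs/readme.txt#3 - edit change 5 (text) 0123456789ABCDEF0123456789ABCDEF BAD!
//spec/client/old_ws.p4s#1 - add change 7 (text) MISSING!
EOF

# Count blank or comment lines in the regex file; this must be zero,
# since an empty pattern matches every line and would hide all errors.
bad_lines=$(grep -Ec '^[[:space:]]*(#|$)' "$regex_file" || true)
echo "blank/comment lines in regex file: $bad_lines"

# Errors that remain after applying the ignores.
remaining=$(grep -Ev -f "$regex_file" "$old_log" || true)
echo "$remaining"
```

Here the BAD! error under //Beta/main/src and the spec depot client entry
are suppressed, while the MISSING! error under //Beta/main/src and the
//Alpha error remain.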
This option is intended to provide a way to ignore unrecoverably lost
file revisions from things like past infrastructure failures, for
which search and recovery efforts have been abandoned. This option
subtly changes the question answered by this script from "Are there any
verify errors?" to "Are there any new verify errors, errors we don't
already know about?"
WARNING: This option is not intended to be incorporated into the primary
method of verification, because ignoring archive errors in this script
does not solve the problem at its source. Ideally, the root cause of
the verify errors should be addressed by recovering lost archives,
injecting replacement content, or other means. So long as verify errors
remain, even if ignored by this option, users attempting to access the
revisions will still see Librarian errors, and replicas will encounter
errors trying to pull the missing archives. This option could increase
the risk that such revisions are never dealt with.
-Ocache
Specify '-Ocache' to attempt a verification on a replica configured
with a 'lbr.replication' replication configuration setting value
of 'cache'. By default, if the 'lbr.replication' configurable is
set to 'cache', this script aborts, as verification on such a replica
will schedule transfers that are likely unintended. This is a
safety feature.
The 'cache' mode is generally used on replicas or edge servers with
limited disk space. Because running a verify will cause transfers
of any missing files, this could result in filling up the disk.
Use of '-Ocache' is strongly discouraged unless combined with
other options to ensure that only targeted paths are scheduled
for transfer.
-v Verbose. Show output of verify attempts, which is suppressed by default.
Setting SDP_SHOW_LOG=1 in the shell environment has the same effect as -v.
The default behavior of this script is to generate no terminal output,
but instead to write output into a log file -- see LOGGING below. If
'-v' is specified, the generated log is sent to stdout at the end of
processing. This flag is not recommended for routine cron operation or
for large data sets.
-L <log>
Specify the log file to use. The default is /p4/N/logs/p4verify.log
Log rotation and old log cleanup logic does not apply to log files
specified with -L. Thus, using -L is not recommended for routine
scheduled operation, e.g. via crontab.
DEBUGGING OPTIONS:
-n No-Operation (NO_OP) mode, for debugging.
Display certain commands that would be executed without executing
them. When '-n' is used, commands that might take a long time to
run or affect data are only displayed.
Even in '-n' mode, some information-gathering commands such as
listing shelved CLs are executed, which may cause the script to take
a bit of time to run on a large data set even in dry run mode.
-d Specify that debug messages should be displayed.
-D Use bash 'set -x' extreme debugging verbosity, and imply '-d'.
-L off
The special value '-L off' disables logging. This can only be used
with '-n' for debugging.
HELP OPTIONS:
-h Display short help message
-man Display man-style help message
EXAMPLES:
Example 1: Full Verify
This script is typically called via cron with only the instance
parameter as an argument, e.g.:
p4verify.sh 1
Example 2: Fast Verify
A "fast" verify is one in which only the check for MISSING archives
is done, while the resource-intensive checksum calculation of
potentially BAD existing archives is skipped. This is especially
useful when used on a replica.
p4verify.sh 1 -o MISSING
Example 3: Fast and Recent Verify
The '-o MISSING' and '-recent' flags can be combined for a very
fast check. This check might be incorporated into a failover
procedure.
p4verify.sh 1 -o MISSING -recent
Example 4: Submitted Files Only
This will verify only submitted files, ignoring shelves and the
spec and unload depots, putting the results in a specified log:
p4verify.sh 1 -ns -nS -nu -L /p4/1/logs/p4verify.submitted.log
Example 5: Shelved Files Only
This will verify only shelved files, ignoring submitted files and the
spec and unload depots, putting the results in a specified log:
p4verify.sh 1 -nr -ns -nu -L /p4/1/logs/p4verify.shelved.log
Example 6: A Dry Run
The '-n' option can be used for a dry run. Output may also be
displayed to the screen ('-v') for a dry run and the log file optionally
discarded:
p4verify.sh 1 -n -nS -L off -v
Example 7: Archive File Load for New Replica
The p4verify.sh script can be used to schedule transfers of a large
number of files to a new replica. When doing so, however, overloading
the new replica's pull queue with too many files may impact metadata
replication. This can be addressed by combining a variety of
options, such as '-chunks' and '-Q'. For example:
p4verify.sh 1 -chunks 200M -Q 10000 -w 20 -o MISSING
NOHUP USAGE:
Because archive verification is typically a long running task,
it is advisable to use 'nohup' to call each command, and combine
that by running the command as a background process. Alternately,
use 'screen' or similar.
Any of the examples above can be used with 'nohup', with output
redirected to /dev/null (i.e. to "the void"), as this script handles
logging and output redirection.
To use 'nohup', start the command line with 'nohup', and then after
the command, add this text exactly:
< /dev/null > /dev/null 2>&1 &
As an example, Example 2 above, called with nohup, would look like:
nohup /p4/common/bin/p4verify.sh 1 -o MISSING < /dev/null > /dev/null 2>&1 &
With the ampersand '&' at the end, the command will appear to return
immediately as the process continues to run in the background.
Then optionally monitor the log:
tail -f /p4/1/logs/p4verify.log
LOGGING:
This script generates no output by default. All output (stdout and stderr) is
logged to /p4/N/logs/p4verify.log.
The exception is usage errors, which result in an error being sent to
stderr, followed by usage info on stdout, followed by an immediate exit.
NOTIFICATIONS:
In addition to logging, a short summary of the verify is sent as a
notification. The summary is reliably short even if the output of the
verifications done by this script results in a large log file.
There are two notification schemes with this script:
* Email notification is always attempted.
* AWS SNS notification is attempted if the SNS_ALERT_TOPIC_ARN custom
setting is defined. This is typically set in:
/p4/common/site/config/p4_N.vars.local
TIMING:
The log file captures various timing information, including the
time required to verify each depot, or each chunk or path if
'-paths' or '-chunks' are used. The time to verify shelves
in all depots is reported separately from submitted files.
Timing indications all start with the text 'Time: ' on the beginning
of a line of output in the log file, and can be extracted with a
command like this example (adjusting the log file name as needed):
grep ^Time: /p4/1/logs/p4verify.log
EXIT CODES:
An exit code of 0 indicates no errors were encountered attempting to
perform verifications, AND that all verifications attempted
reported no problems.
An exit status of 1 indicates that verifications could not be
attempted for some reason.
An exit status of 2 indicates that verifications were successfully
performed, but that problems such as BAD or MISSING files
were detected, or else system limits prevented verification.
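A wrapper script might branch on these exit codes. The sketch below shows
one possible mapping. The function name and message wording are
hypothetical, and the p4verify.sh call is shown only as a comment.

```shell
# Map a p4verify.sh exit code to the meanings listed above.
describe_verify_exit() {
    case "$1" in
        0) echo "verified clean" ;;
        1) echo "verification could not be attempted" ;;
        2) echo "verification ran but problems were detected" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}

# Usage in a wrapper script (instance '1' as in the examples above):
#   p4verify.sh 1; describe_verify_exit $?
describe_verify_exit 2
```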