SDP-269

tom_tyler (C. Thomas Tyler)
C. Thomas Tyler created this job , modified by C. Thomas Tyler
Closed
Optimize journalPrefix values for master, edge, standby, replica.

For master/commit servers only, journalPrefix will remain
/p4/N/checkpoints/p4_N.  So the /p4/N/checkpoints folder, regardless
of what machine it appears on, will represent checkpoints created on
th master/commit server.

For standby replicas only (using journalcopy), journalPrefix will be
/p4/N/journals.rep/p4_N, on /hxlogs volume.  This is because
journalcopy replicas uniquely use journalPrefix to determine where to
write active journals.  This is the current behavior.

Warning: Taking a checkpoint with this setting will cause it to go to
/hxlogs, meaning it may not be big enough to store checkpoints, and
also isn’t typically backed up.  So, don’t take checkpoints on
journalcopy replicas; this needs to be in the docs.

For edge servers and other replica types, journalPrefix will look
something like /p4/N/checkpoints.bos_edge/p4_N.bos_edge, on the
/hxdepots volume.  So there should be plenty of space to create
checkpoints, and no big performance implications since active journals
are written to P4JOURNAL file and rely on jounalPefix only for
checkpoint creation and journal rotation.
23776Removed obsolete code from recreate_db_checkpoint.sh and rotate_journal.sh.
23237Fixes and Enhancements:
* Enabled daily_checkpoint.sh operate on edge servers, to
keep /p4/N/offline_db current on those hosts for site-local
recovery w/o requiring a site-local replica (though having
a site-local replica can still be useful).
* Disabled live_checkpoint.sh for edge servers.
* More fully support topologies using edge severs, in both
geographically distributed and horizaontal scaling "wokspace
server" solutions.
* Fix broken EDGESERVER value definition.
* Modified name of SDP counter that gets set when a checkpoint is taken
to incorporate ServerID, so now the counter name will look like
lastSDPCheckpoint.master.1, or lastSDPCheckpoint.p4d_edge_sfo, rather
than just lastSDPCheckpoint.

There will be multiple such counters in a topology that uses edge
servers, and/or which takes checkpoints on replicas.

* Added comments for all functions.

For the master server, journalPrefix remains:
/p4/N/checkpoints/p4_N

The /p4/N/checkpoints is reserved for writing by the
master/commit server only.

For non-standby (possibly filtered) replicas and edge serves,
journalPrefix is:
/p4/N/checkpoints.<ShortServerID>/p4_N.<ShortServerID>

Here, ShortServerID is just the ServerID with the 'p4d_' prefix
trimmed, since it is redundant in this context.  See mkrep.sh,
which enshines a ServerID (server spec) naming standard, with
values like 'p4d_fr_bos' (forwarding replica in Boston) and
p4d_edge_blr (Edge server in Bangalore).  So the journalPrefix
for the p4d_edge_bos replica would be:
/p4/N/checkpoints.edge_bos/p4_N.edge_bos

For "standby" (aka journalcopy) replicas, journalPrefix is set
to /p4/N/journals.rep. which is written to the $LOGS volume, due
to the nature of standby replicas using journalPrefix to write
active server logs to pre-rotated journals.

Some take-away to be updated in docs:
* The /p4/N/checkpoints folder must be reserved for checkpoints that
originate on the master. It should be safe to rsync this folder
(with --delete if desired) to any replica or edge server.  This is
consistent with the current SDP.
* I want to change 'journals.rep' to 'checkpoints.<ShortServerID>'
for non-standby replicas, to ensure that checkpoints and journals
taken on those hosts are written to a volume where they are backed
up.
* In sites with multiple edge serves, some sharing achive files
('workspace servers'), multiple edge servers will share the same
SAN. So we one checkpoints dir per ServerID, and we want that
dir to be on the /hxdepots volume.

Note that the journalPrefix for replicas was a fixed /p4/N/journals.rep.
This was on the /hxlogs volume - a presumably fast-for-writes volume,
but typically NOT backed up and not very large. This change puts it
under /p4/N/checkpoints.* for edge servers and non-standby replicas,
but ensures other replica types and edge servers can generate
checkpoints to a location that is backed up and has plenty of storage
capacity.  For standby replicas only (which cannot be filtered),
the journalPrefix remains /p4/N/journals.rep on the /hxlogs volume.
23778Removed obsolete code from recreate_db_checkpoint.sh and rotate_journal.sh.
23266Fixes and Enhancements:
* Enabled daily_checkpoint.sh operate on edge servers, to
keep /p4/N/offline_db current on those hosts for site-local
recovery w/o requiring a site-local replica (though having
a site-local replica can still be useful).
* Disabled live_checkpoint.sh for edge servers.
* More fully support topologies using edge severs, in both
geographically distributed and horizaontal scaling "wokspace
server" solutions.
* Fix broken EDGESERVER value definition.
* Modified name of SDP counter that gets set when a checkpoint is taken
to incorporate ServerID, so now the counter name will look like
lastSDPCheckpoint.master.1, or lastSDPCheckpoint.p4d_edge_sfo, rather
than just lastSDPCheckpoint.

There will be multiple such counters in a topology that uses edge
servers, and/or which takes checkpoints on replicas.

* Added comments for all functions.

For the master server, journalPrefix remains:
/p4/N/checkpoints/p4_N

The /p4/N/checkpoints is reserved for writing by the
master/commit server only.

For non-standby (possibly filtered) replicas and edge serves,
journalPrefix is:
/p4/N/checkpoints.<ShortServerID>/p4_N.<ShortServerID>

Here, ShortServerID is just the ServerID with the 'p4d_' prefix
trimmed, since it is redundant in this context.  See mkrep.sh,
which enshines a ServerID (server spec) naming standard, with
values like 'p4d_fr_bos' (forwarding replica in Boston) and
p4d_edge_blr (Edge server in Bangalore).  So the journalPrefix
for the p4d_edge_bos replica would be:
/p4/N/checkpoints.edge_bos/p4_N.edge_bos

For "standby" (aka journalcopy) replicas, journalPrefix is set
to /p4/N/journals.rep. which is written to the $LOGS volume, due
to the nature of standby replicas using journalPrefix to write
active server logs to pre-rotated journals.

Some take-away to be updated in docs:
* The /p4/N/checkpoints folder must be reserved for checkpoints that
originate on the master. It should be safe to rsync this folder
(with --delete if desired) to any replica or edge server.  This is
consistent with the current SDP.
* I want to change 'journals.rep' to 'checkpoints.<ShortServerID>'
for non-standby replicas, to ensure that checkpoints and journals
taken on those hosts are written to a volume where they are backed
up.
* In sites with multiple edge serves, some sharing achive files
('workspace servers'), multiple edge servers will share the same
SAN. So we one checkpoints dir per ServerID, and we want that
dir to be on the /hxdepots volume.

Note that the journalPrefix for replicas was a fixed /p4/N/journals.rep.
This was on the /hxlogs volume - a presumably fast-for-writes volume,
but typically NOT backed up and not very large. This change puts it
under /p4/N/checkpoints.* for edge servers and non-standby replicas,
but ensures other replica types and edge servers can generate
checkpoints to a location that is backed up and has plenty of storage
capacity.  For standby replicas only (which cannot be filtered),
the journalPrefix remains /p4/N/journals.rep on the /hxlogs volume.
  • Details
  • Comments -
Status
Closed
Project
perforce-software-sdp
Severity
C
Reported By
C. Thomas Tyler
Reported Date
Modified By
C. Thomas Tyler
Modified Date
Owned By
tom_tyler
Component
core-unix
Type
Feature