TODO for VCP

*NOTE* Doing a `grep -r TODO lib bin` will find lots of small things and
future feature ideas. This file will grow to include the more important of
those.

- Bugfixes
  - in converting xfree86 xc/docs/misc/Imakefile,v, VCP is delaying the
    creation of all branches with no changes until far too late; the
    branches are being rooted in the head rev instead of the proper
    revisions found in CVS. This indicates that VCP is not branching from
    the specified parent revision in every case. (gerry@perforce.com).
  - VCP is not treating vendor tags and imports based on them properly: it
    should treat them as branches (and not the cvs2p4 'import' branch if
    possible).
  - Port VCP::*::vss to use state files
  - Implement branching in VCP::*::vss
  - Port to Win32 (ie tweak IPC::Run3 to work there)
  - Make it so that a dest of cvs:blah:foo is either an error or is just
    like cvs:blah:foo/...
  - Prevent keyword expansion on all checkouts. Found by Thomas Quinot
  - Get detailed spec of legal names (filenames, branch spec/label names)
    for p4.
  * Carry executable bit through (Nick Ing-Simmons)
  * Make , , etc. use binary escapes when needed
  * VCP::Dest::cvs needs to set the binary mode properly on files it
    creates and checks in [[check status]]
  * VCP::Source::cvs needs to deduce binary modes correctly
    [[check status]]
  * Make test suite skip cvs tests if cvs is not present (like it skips p4
    tests if p4 is not present). [[check status]]
  - there should be an option to detect collisions between underscorified
    tags in VCP::Dest::cvs. Perhaps enabled by default.
  - use the upcoming -t option for p4 filelog (expected in 2002.2) to
    extract times with dates. Issue a warning recommending an upgrade or
    enabling a metadata reading mode if not present. Warn about metadata
    loss when using older p4.
  - Review cvs2p4 and (in the guest depot) vss2p4 for gotchas they catch
    that we should
    [[STATUS: tag underscorification is done and we've checked that cvs
    does not limit its tag lengths]]
  - get p4 forms parsing to lex *exactly* like p4 does it.
  * VCP should record its starting directory early on and then use it as a
    base directory for rel2abs(). Thinking specifically of cvsroot
    absolutification, but the sources should be grepped.
- Testing
  * Test the VCP::Source::cvs -r and -d options (-r *was* tested before
    --continue was added; now it's not tested at all. Plus, -d was never
    tested and is not fully implemented in the RCS file scanning code).
  - Convert VCP::TestUtils & all tests to using IPC::Run3, for speed and
    portability's sake.
  * use get_vcp_output() in the test suite wherever we're extracting
    revml.
  * test -r options in t/91cvs2revml.t. Need to test each of (start, end,
    start+end) with both rev numbers and tags
  - Test VCP::Source::cvs branching corner conditions, perhaps all corner
    conditions for this sort of thing.
    - when the first rev on a branch is a delete
    - when the first rev on a branch occurs as the first change in an
      incremental export
    - when a file is added on a branch
  - Test importing two bootstrap imports one on top of the other
    - should VCP warn when it detects this?
- Feature Adds
  - Transfer and translate branch specs and other metadata on p4->p4
    transfers.
  - Handle differently localized timestamps. RevML should be in GMT or use
    ISO8601's tz indicators.
  - Allow vcp to log *everything* it outputs to make it easier for the
    user to capture things, and log it all by default. Right now, the user
    needs to use some external redirect or tee utility, which might not be
    available.
  - branch_label fields in the sections to allow cvs branch labels to be
    carried through to p4 and, possibly, p4 branch spec names to be
    assigned and used as directory names for branches.
  - VCP::Filter::branchmap to allow elements to be altered, for instance
    by assigning names to p4 branches' elements given their elements
    (which are a path to the parent directory of each file).
  - VCP::Filter::usermap to allow usernames to be altered
  - warn when a *Map filter is used and the default <> action fires,
    because this indicates a possible missing or faulty rule.
  - improve error reporting for .vcp files. Either use recursive descent
    to delegate each value to the appropriate object or capture line and
    column with each value for error reporting.
  - in foo->p4 (where foo != p4), try to integrate
  - in p4->foo suppress revs with empty deltas immediately after an
    integrate so as to not introduce changeless revs in other repos,
    including ->p4 conversions when --change-branch-rev-1 is not
    specified.
  - give better diagnostics when the state file appears to be out of date:
    - when --continue is specified we can't.
    - when --continue is not specified and a revision already exists in
      the destination state files (but not in the destination rep).
    We could keep track of the last known change to the destination in the
    state file and probe to see if that file is at the indicated revision.
    Or just watch for revisions coming along that should be in the dest
    (according to the state file) but aren't.
  - Add a --skip-unchanged-revs option or VCP::Filter that skips unchanged
    revisions; all children of such a revision become children of that
    revision's parent.
  - Enable an "--append-revs" flag to allow a bootstrap file to be added.
    This is dangerous (there's no checking to be sure that the first new
    version is the first version after the existing version in the
    repository) but useful. This might be done already with --bootstrap
    specified on the source, but is completely untested.
  - reports and queries against the state files to show:
    - head rev of each filebranch
    - what source revs ended up where (path & rev_id)
    - how branches got mapped
    - what the main branch_id is for foo->cvs imports.
  - Need to make the transfers more transactional, so we can recover from
    where we left off when something fails. We're part way there with the
    --continue support, but VCP needs to log what it's about to submit and
    sniff out how far the submittal got before it blew up. This would
    allow recovery by updating the state files to the correct state and
    not trying to double commit files.
  - Allow the state files to be checked in to the destination. Probably as
    text, in order to avoid sdbm byte ordering issues if they are checked
    out on a differently byte ordered system.
  - Perhaps allow keyword expansion, but convert the expanded texts so
    that they are no longer seen as RCS style keywords. This would allow
    imported files to have a "stamp of origin" in them. Would also need an
    option to leave the keywords in place in this case, since the user
    might presumably want expansion to work correctly in the new
    repository too. Suggested by Thomas Quinot.
  - Add a link checker to vcp html
  * An option to not bring over deleted files (Steve James)
  - A report destination that offers a preview of what a transfer will do,
    with summary and long views.
  - Limit the number of NtLkLy queries per command to prevent server
    lockup. Steve James (possibly URGENT, need to test).
  - VCP::Source::cvs guesses what you mean when you specify a starting
    and/or ending tag using -r. The underlying cvs implementation emits
    all revisions for files that don't match the -r (assuming, I guess,
    that all of the file's revisions are within the tagged limits). So
    VCP::Source::cvs looks through the files that did match the tags for
    the oldest and newest times of revisions and throws away revisions
    from files that *didn't* match and are older or newer than the oldest
    or newest.
    TODO: perhaps do this culling after change aggregation somehow so that
    an untagged file's rev that gets associated with a particular change
    gets included.
  * Set CVS_PASSFILE for all cvs invocations to prevent mucking with the
    user's current .cvspass
  - Use ptys to handle CVS login, if available. Recommend installing
    IO::Pty if needed but not installed.
  - finish off VCP::Source::cvs's simulation of the cvs (rlog) -r option
  - implement VCP::Source::cvs -d option when scanning files
  - PERHAPS checksum all non-binary files line by line, removing all \r's
    in order to reduce sensitivity to varying platform settings between
    the source and the destination.
  - allow VCP::{Source,Dest}::* to "sniff" at unknown directories / files
    to see if they can detect what kind of repository is there. This will
    make schemes optional, so tab completion will work again.
- Efficiency
  - extend VCP::Source::cvs to build revisions directly from the RCS
    files. This will probably mean memorizing the offsets of the delta or
    full text chunk for each version in the RCS file, then applying them
    all as needed to get the desired version. They may need to be reversed
    as a speed hack since RCS files tend to store the most recent revision
    in full text and use deltas from that to encode older revisions, and
    we'll probably want the oldest revision first. This means that we can
    build the more recent revisions from the older revisions by reversing
    the deltas as we apply them to build the older revisions, then
    applying those reversed deltas. Or something; not sure what's best
    here.
  - VCP::Source::revml should only keep on hand the versions it needs at
    each moment in order to conserve disk space. The problem with this is
    that the RevML may be coming from a non-seek()able byte stream, like
    STDIN, so we need to patch as we go. One alternative is to cache the
    revml off to the side and rescan it if this happens. Another is to
    only patch-as-you-go if the input is non-seek()able.
    NOTE: P'haps VCP::Source::revml and VCP::Source::cvs can share the RCS
    file scanner.
    NOTE: P'haps VCP::Source::revml and VCP::Source::cvs can share an
    internal file revision format; RCS is bass ackwards for our needs.
  - Once a version has been used, it should be cleaned up out of the
    source directory unless it will be needed again. This used to be the
    way things were done, but it has been changed to support branching.
    STATUS: partially implemented. Revs now refer directly to their
    ancestor revs, and thus revs may be freed willy-nilly and needed revs
    won't disappear.
    TODO: Have each ::Dest do $r->previous( undef ) as aggressively as
    possible so that older revs will be garbage collected as needed.
  - in --continue mode, VCP::Source::p4 could do a p4 files to get all the
    source_filebranch_ids and then get the last_rev_in_filebranch for
    each, which would probably be a lot quicker than running a full
    filelog and throwing away most of the data (ie for a --continue on a
    large tree with lots of changes, but only one or two files have
    changed since the last export).
  * Consider offering (david d zuhn)
  - benchmark different strategies for IPC::Run3's buffers, see if we can
    patch it to be faster by using a pipe.
  - the RevMapDB should be purged of any revisions descended from a
    revision being transferred. Right now, if you restore the set
    repository from an earlier backup and don't rewind the vcp_state
    directory, you will end up with a mix of RevMapDB entries from the
    prior transfer and the current. For now, a warning is generated.
- Cleanup
  * create an init() phase after new() so new() can give immediate
    feedback on bogus command line options. As it is, sourcing a CVS
    pserver causes a password prompt before the dest even gets a crack at
    the options.
  - use parent/child nomenclature instead of previous/next
  - have VCP::Rev use a next pointer, perhaps in a %next, as VCP::Dest
    does, to ensure garbage collection.
  - Centralize underscorification of tag names for VSS.
  * factor the full sorting code out of VCP::Dest in to a VCP::Sort and
    only use it in VCP::Dest::revml. Modify t/01sort.t to suit.
  * remove all instances of launch_p4d from t/*.t and delete it from
    VCP::TestUtils; this is to increase the consistency of the test suite
    and prevent "fix it twice" issues.
- Rejected
  - VCP::Source::p4 should be able to create and read a metadata dump as
    an option. Watch out for different schemas in different p4d versions.
    Q: Read the btree files directly? 'twould be faster and more space
    efficient.
    (rejected by Perforce so as to not tie VCP to p4d's schema; this may
    still be contributed as OSS by others, but will not be done by the
    core team any time soon unless the plans change)
  - VCP::Dest::p4 should write a metadata file directly, and be able to
    merge new data in to a destination's exported metafile for reimport.
    Q: Write the btree files directly? This would bypass any checking p4d
    does on recovering from a metadata file.
    (rejected by Perforce so as to not tie VCP to p4d's schema; this may
    still be contributed as OSS by others, but will not be done by the
    core team any time soon unless the plans change)
  - An option to prefix all labels with some user-defined string Steve
    James (this is no longer necessary, as vcp does not add its own labels
    by default).
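
NOTE: a rough sketch of the -r culling that the Feature Adds item on
VCP::Source::cvs describes (keep revs from files that matched the tags, and
drop revs from unmatched files that are older or newer than anything in the
matched files). It is written in Python rather than VCP's Perl purely for
illustration. The Rev record, its fields, and cull_untagged_revs() are
hypothetical names, not part of VCP, and the sketch assumes each revision
carries a commit time and a flag saying whether its file matched the -r tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rev:
    # Hypothetical record; VCP's real revision objects look different.
    path: str
    rev_id: str
    time: int          # commit time, seconds since the epoch
    matched_tag: bool  # True if this rev's file matched the -r tag(s)


def cull_untagged_revs(revs):
    """Keep every rev from files that matched the tags. For files that did
    not match, keep only revs whose time falls inside the window spanned by
    the oldest and newest revs of the matched files."""
    matched_times = [r.time for r in revs if r.matched_tag]
    if not matched_times:
        # Nothing matched, so there is no window to cull against.
        return list(revs)
    oldest, newest = min(matched_times), max(matched_times)
    return [r for r in revs if r.matched_tag or oldest <= r.time <= newest]
```

This culls before any change aggregation, so it does not address the TODO
about keeping an untagged file's rev that belongs to an included change.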