cvs2p4 2.3.1                                           July 30, 2002

Release 2.3.1 of cvs2p4 includes a radically different approach to
importing CVS history into Perforce, in order to provide a much faster
conversion process.

cvs2p4 1.x releases put data into Perforce by "replaying" all of the
CVS changes into a live Perforce server. This newer version works by
directly generating Perforce metadata, and linking (or copying) the RCS
archives from the CVS repository directory directly into the Perforce
file archive.

If you have problems with this version, you can still get a copy of the
older (and slower, but time-tested) version at

  ftp://ftp.perforce.com/perforce/utils/cvs2p4/cvs2p4-1.3.3.tar


==== INTRODUCTION

This small set of tools provides a means for importing a CVS module
into Perforce. It was originally developed for use at Network
Appliance, to convert our product source code revision history from CVS
into Perforce. As such it sprouted some NetApp-specific features suited
to our special needs, but I have made an attempt to make these
unobtrusive to the general user.

Basically, it is patterned at a high level after the PVCS to Perforce
converter available on the Perforce web site, doing the following steps
during a conversion:

  - Scans the CVS repository to generate a metadata file;

  - Scans the metadata file to identify groups of RCS revisions that
    comprise Perforce changes;

  - Imports the revisions/log history into a Perforce depot, by
    directly generating Perforce metadata in "journal" format (driven
    by the output of the previous phase);

  - Finally (and optionally), generates a map of RCS revisions and the
    Perforce changes they belong to.

cvs2p4 tries to make the resultant Perforce depot look as if the work
in CVS had been going on in Perforce. In particular, it attempts to
create changes corresponding to the creation of whole new branches,
a la

  p4 integrate //depot/branchA/... //depot/branchB/...

This is in contrast to rcstoperf.sh, which scattered the "integrates"
corresponding to the creation of files on new branches into many
changes (basically, according to when the file was actually first
changed in the new branch).

cvs2p4 also allows you to import only selected branches, and/or to map
some branch other than the CVS trunk to become the new "main" branch in
Perforce. See the notes in the template config file ("test/config") for
more information on these features.

As of version 1.3, cvs2p4 will also import CVS symbolic version tags.

Note: A CVS tagged revision will make it into Perforce labels ONLY when
the revision is in fact present in the converted depot, subject to the
branches selected for import. (See the notes for the "WANTLINES"
variable in the config file).


==== MANIFEST

After unpacking the distribution archive, use the MANIFEST script to
verify that you have all of the pieces. The output should go something
like this:

  $ MANIFEST
  MANIFEST
  Artistic
  README
  NEWS
  bin/genmetadata
  bin/genchanges
  bin/dochanges
  bin/dolabels
  bin/revmap
  lib/util.pl
  test/file,v
  test/dollar$file,v
  test/space file,v
  test/config
  test/runtest
  test/norm
  test/metadata.good
  test/lines.good
  test/changes.good
  test/p4_changes_-l.good
  test/p4_describe.good
  test/p4_filesat.good
  test/p4_labels.good
  All ok


==== REQUIREMENTS

This stuff should work on any Unix host that supports:

  - Perl 5.x, with working dbm support (i.e., dbmopen()/dbmclose()
    work). The scripts assume that perl will be found via $PATH. It
    must be a perl5! Some people have reported problems that seem to be
    related to dbm limitations with some perls when converting very
    large repositories. I like implementations based on Berkeley-DB.

  - Perforce release 2002.1. Later Perforce releases may work, but
    since this script generates journal-format metadata directly, it
    may need to be changed in order to work correctly with other
    Perforce releases.

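The Perl requirement above can be checked before starting a conversion.
The following is only a rough sketch in Python and is not part of the
cvs2p4 distribution; the check_perl name and the probe one-liner are
made up for this example. It looks for perl via $PATH, asks it for its
version, and tries a dbmopen()/dbmclose() round trip in a scratch
directory. It says nothing about the size limits some dbm
implementations have with very large repositories.

```python
import os
import shutil
import subprocess
import tempfile

# Perl one-liner (illustrative): store a key through dbmopen(), close the
# database, reopen it, and print the stored value.
PERL_DBM_PROBE = (
    'dbmopen(%h, $ARGV[0], 0644) or die "dbmopen: $!"; '
    '$h{k} = "v"; dbmclose(%h); '
    'dbmopen(%h, $ARGV[0], 0644) or die "dbmopen: $!"; '
    'print $h{k}; dbmclose(%h);'
)


def check_perl():
    """Return (perl_path, is_perl5, dbm_ok) for the perl found via $PATH."""
    perl = shutil.which("perl")
    if perl is None:
        return (None, False, False)

    # $] holds the interpreter version as a number such as 5.008008.
    ver = subprocess.run([perl, "-e", "print $]"],
                         capture_output=True, text=True)
    is_perl5 = ver.returncode == 0 and ver.stdout.startswith("5.")

    # Run the dbm round trip in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        probe = subprocess.run(
            [perl, "-e", PERL_DBM_PROBE, os.path.join(tmp, "probe")],
            capture_output=True, text=True)
    dbm_ok = probe.returncode == 0 and probe.stdout == "v"

    return (perl, is_perl5, dbm_ok)


if __name__ == "__main__":
    print(check_perl())
```

If the last two values are not both true, the perl on your $PATH is
probably not suitable for running the conversion scripts.
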

==== WHAT IT DOES

This converter will import a CVS module into Perforce, preserving the
branching structure seen in the RCS ,v files in the CVS repository, and
translating it into Perforce branches within the depot.

As it stands, it will only import RCS branches up through the highest
numbered revisions on branches that have branch tags referring to them;
thus, it will not necessarily bring *every* revision in the CVS module
into Perforce, but *will* bring in every revision leading up to the
current revision for every branch it imports. I think this is what most
people will want; if not, hack away.

Like the "rcstoperf.sh" converter available on the Perforce web site,
it applies heuristics to try and identify multiple changes in CVS that
are highly likely to comprise what would be seen as a single change in
Perforce, and makes them appear as a single Perforce change. (The
heuristics are: checked in by the same user, proximal in time, and
bearing an identical log message).

It deals correctly with files that are dead on the CVS trunk (i.e.,
where the RCS ,v files are in the "Attic/" directory).

The converter attempts to leave converted files in Perforce with a
sensible Perforce file type (see `p4 help filetypes` for a description
of file types in Perforce) after the conversion. However, due to
limitations in RCS's notion of "file type" (the -k options, controlling
keyword expansion), cvs2p4 must currently decide to import all "text"
files as Perforce type "text" (text with no keyword expansion) or
"ktext" (text with keyword expansion). This is controlled by the
"$KTEXT" configuration option, which is on by default.

Also note that binary files will be converted to Perforce type
"binary+D"; the (unusual) "+D" is there because the converter works by
using the existing RCS archive files directly; normally in Perforce,
filetype "binary" implies storage of complete revisions, rather than as
RCS archives. Rest assured that "binary+D" is correct.

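The change-grouping heuristics mentioned earlier in this section (same
user, proximal in time, identical log message) can be sketched roughly
as follows. This is an illustration in Python, not the converter's
actual code: the Revision record, the group_changes name, the
300-second window, and the rule that a file may appear only once in a
change are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    """One RCS revision (hypothetical record layout for this sketch)."""
    path: str
    rev: str
    author: str
    timestamp: int  # check-in time, seconds since the epoch
    log: str


def group_changes(revisions, window=300):
    """Group revisions into candidate changes.

    A revision joins the newest group with the same author and log
    message if it was checked in within `window` seconds of that group's
    latest revision and its file is not already in the group (assumed
    rule). Otherwise it starts a new group.
    """
    groups = []
    latest = {}  # (author, log) -> index of the newest group for that key
    for r in sorted(revisions, key=lambda r: r.timestamp):
        key = (r.author, r.log)
        i = latest.get(key)
        if (i is not None
                and r.timestamp - groups[i][-1].timestamp <= window
                and all(m.path != r.path for m in groups[i])):
            groups[i].append(r)
        else:
            groups.append([r])
            latest[key] = len(groups) - 1
    return groups


if __name__ == "__main__":
    revs = [
        Revision("a.c", "1.2", "rmg", 1000, "fix bug"),
        Revision("b.c", "1.5", "rmg", 1030, "fix bug"),
        Revision("a.c", "1.3", "rmg", 5000, "fix bug"),
        Revision("c.c", "1.1", "bob", 1010, "add c"),
    ]
    for n, group in enumerate(group_changes(revs), 1):
        print(n, [(m.path, m.rev) for m in group])
```

With the sample data, the two "fix bug" check-ins made 30 seconds apart
end up in one group, while the later a.c revision and the check-in by
the other user each get a group of their own.
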
The "UI" for the converter is not very slick, but for most people it's
a one-time kind of tool anyway. Feel free to improve it if you are so
inclined.

While I am currently a Perforce employee, please understand that this
is *not* presently officially supported by Perforce. It is supplied in
hopes that somebody will find it useful (or perhaps only
entertaining :-).


==== TESTING

I have included a *very* rudimentary automated test "suite", in the
test/ directory. You can use this to verify that it seems to work in
your environment. To run it:

1. Edit test/config, and change the lines

     # p4 command location (If other than "/usr/local/bin/p4")
     # $P4 = "/usr/local/bin/p4";

     # p4 command location (If other than "/usr/local/bin/p4d")
     # $P4D = "/usr/local/bin/p4d";

     # Perforce server we're using.
     # $P4PORT = "localhost:1680";

   to reflect the actual location of your "p4" and "p4d" commands, and
   the server port that you are using.

   *** Note: Previous versions of this tool allowed you to run the
   Perforce server on a different host than the one where the
   conversion tools were run. This is no longer the case; thus you
   should probably never change the "localhost" part of the P4PORT
   configuration setting, above.

2. Run the tests with

     test/runtest

   This should run all of the conversion scripts on the test CVS module
   (well, file - it's a one-file module!), and then verify a few things
   by querying the Perforce server after the conversion is complete. If
   everything goes well, the end of the output should be

     runtest: ok

In this version, the converted CVS "module" consists of a very few
files, but it does have a carefully constructed branching structure,
intended to verify that the converter does the right stuff with respect
to branching.


==== USAGE

1. Make a directory to hold all the working files for the conversion,
   and create a config file, starting with test/config as a template:

     $ mkdir convdir; cp test/config convdir

   Edit the convdir/config file to reflect your locale and intent.
   (See the comments in the config file).

2. Run bin/genmetadata: It takes a single argument - the name of the
   directory where the "config" file resides. (It will create all
   intermediate, temp, and working files under this directory.)

     $ bin/genmetadata convdir
     genmetadata: rm -rf convdir/logmsgs.dir convdir/logmsgs.pag ...
     .
     .  (filenames of each file in the CVS module, as they are scanned)
     .
     ===== Lines referenced:
     chupa
     curly
     ha         <- a list of branch tags encountered in the scan;
     larry         also saved to convdir/lines.
     shemp
     xxx

   This reads convdir/config to get its marching orders, then scans the
   CVS module for all ,v and Attic/,v files, creating:

     convdir/metadata     <- the extracted RCS/CVS metadata
     convdir/logmsgs.pag  <- An ndbm database
     convdir/logmsgs.dir  <- of the log messages
     convdir/lines        <- A list of "codelines" (== branch tags)

   At this point, you may want to look at the list of branch tags
   encountered (which was written to convdir/lines), edit the config
   file, setting WANTLINES to 1, and filling in the "<&1 | tee OUT

     dochanges> /bin/rm -f convdir/revmap.db ...
     dochanges> /bin/rm -f convdir/depotmap.db ...
     dochanges> /bin/rm -rf p4root && mkdir -p p4root
     dochanges> /bin/mkdir -p /home/rmg/web/richard_geiger/...
     dochanges> /bin/ln -s /home/rmg/web/richard_geiger/...
     ========== change group 1
     ========== change group 2
     ========== change group 3
     .
     .
     .
     ========== change group 17
     ========== change group 18
     dochanges> cd /home/rmg/web/richard_geiger/...
     Recovering from dbmeta...
     dochanges> cd /home/rmg/web/richard_geiger/...
     Dumping to checkpoint...

   Basically, that's it. When this command finishes, your CVS module
   has been imported to Perforce, in the Perforce server database
   identified by the $P4ROOT configuration variable. The state of the
   resultant database is saved in a checkpoint file named
   $P4ROOT/checkpoint.

   NOTE: cvs2p4 does not create new RCS-format archives (,v files)
   under $P4ROOT; rather, it uses the existing RCS archives in the CVS
   tree directly.
   By default, it does this by making a symbolic link named
   $P4ROOT/depot/IMPORT pointing to the $CVS_MODULE tree. If you'd
   rather have dochanges copy in the CVS module for you, set COPYIMPORT
   in the config file.

6. If you want to import labels from CVS tags, run

     $ bin/dolabels convdir
     make label: testlabel
     dolabels> cd p4root && /usr/local/bin/p4d -jr dblbls
     /home/rmg/web/richard_geiger/guest/richard_geiger/utils/cvs2p4_meta/p4root
     Recovering from dblbls...
     dolabels> cd p4root && rm -f checkpoint; /usr/local/bin/p4d -jd checkpoint
     /home/rmg/web/richard_geiger/guest/richard_geiger/utils/cvs2p4_meta/p4root
     Dumping to checkpoint...

   This step adds the symbolic tag information from the CVS archive
   (for "plain", non-branch tags) to the Perforce database identified
   by the $P4ROOT configuration variable. The state of the resultant
   database is saved in a checkpoint file named $P4ROOT/checkpoint.

   ** NOTE: This version of cvs2p4 does *not* create new RCS archives in
   ** $P4ROOT/depot/...; rather, it creates a symbolic link
   ** "$P4ROOT/depot/IMPORT -> $CVS_MODULE"; i.e., the existing RCS
   ** archives from the CVS repository are used by the Perforce server,
   ** in place. If you'd rather have it make a _copy_ of the RCS archive
   ** files from your CVS repository, set "$COPYIMPORT = 1" in your
   ** config file.

7. If you want the RCS revision-to-Perforce change map, run:

     $ bin/revmap convdir

   Or, for the reverse mapping:

     $ bin/revmap -map rrevmap convdir


==== INCREMENTAL CONVERSIONS

At this time, the recommended procedure for doing "incremental"
conversions - i.e., combining multiple CVS repositories, or doing
subsets of the CVS modules in a repository one at a time - is to do
each as a new conversion (starting with change 1), and then to combine
them as desired using the "perfmerge2.pl" tool. This is also a useful
pattern when you want to combine some new chunk of CVS (or RCS)
repository into an existing Perforce depot.

perfmerge2.pl can be downloaded from:

  ftp://ftp.perforce.com/perforce/r02.1/tools/server/perfmerge2.pl

You'll also need the P4-Journal perl module:

  ftp://ftp.perforce.com/perforce/r02.1/tools/server/P4-Journal.tar.gz

Use the version in the directory corresponding to your Perforce server
release; e.g., .../r02.1/... is for p4d 2002.1; .../r01.2/... for
2001.2, and so forth.

*** For use with servers prior to 2001.1, the name will be
"perfmerge.pl", and there will be no P4-Journal.tar.gz file.

*** In order for this to work, you'll need to ensure that there is no
overlap in the namespaces of files between your existing Perforce
repository and the newly converted files. See the notes at the top of
the perfmerge2.pl script.

perfmerge2.pl can operate in different modes, with respect to the
ordering of change numbers in the merged repositories. You can elect
either

  - to have it renumber all of the merged changesets, so that the
    time-ordered property of all change numbers (both existing and
    newly-merged) is preserved; or,

  - to leave your existing changes numbered as they are, with the
    newly imported changes numbered from the next available change
    number, even though some of them may have taken place (in CVS)
    interleaved in time with your existing Perforce changes.

Note that perfmerge2.pl only merges server metadata; you'll also need
to manually copy the tree of RCS archive files from your newly
converted $P4ROOT into your existing server's $P4ROOT.


==== SUPPORT

I originally wrote and contributed this tool while working for Network
Appliance in 1997. I now work for Perforce, and, while I _am_ chartered
with supporting Open Source software (such as this) as part of my job,
it must be understood that Perforce Software still does not officially
support it. I (and Perforce Software) can make absolutely no warranty
that this will be helpful or even nontoxic for you, nor make any
guarantee that I will be able to provide support.
On the other hand, I have been able to help many users in the past, so
it's worth a try!

  - Richard Geiger
    Open Source Engineer at Perforce
    opensource@perforce.com

Note: because of my role at Perforce, it would be helpful if questions
or requests for help with cvs2p4 were sent to the
"opensource@perforce.com" address, as shown above. Thanks!

(revised July 30, 2002, release 2.3.1)