package VCP::Source::vss ;

=head1 NAME

VCP::Source::vss - A VSS repository source

=head1 SYNOPSIS

   vcp vss:project/...

=head1 DESCRIPTION

Source driver enabling L<C<vcp>|vcp> to extract versions from a VSS
repository.

The source specification for VSS looks like:

    vss:filespec [<options>]

C<filespec> may contain trailing wildcards, like C</a/b/...> to extract
an entire directory tree (this is the normal case).

NOTE: This does not support incremental exports, see LIMITATIONS.

=head1 OPTIONS

=over

#=item --cd
#
#Used to set the VSS working directory. VCP::Source::vss will cd to this
#directory before calling vss, and won't initialize a VSS workspace of
#its own (normally, VCP::Source::vss does a "vss checkout" in a
#temporary directory).
#
#This is an advanced option that allows you to use a VSS workspace you
#establish instead of letting vcp create one in a temporary directory
#somewhere. This is useful if you want to read from a VSS branch or if
#you want to delete some files or subdirectories in the workspace.
#
#If this option is a relative directory, then it is treated as relative
#to the current directory.
##
#=cut

=item -V

   -V 5
   -V 5~3

Passed to C<ss History>.

=item undocheckout

If set, VCP will undo users' checkouts when it runs into the "File ...
is checked out by ..." error. This error occurs when scanning metadata
for a file which is checked out by somebody and there is also a deleted
file of the same name.

NOTE: The VSS account VCP uses may need administrative privileges to
perform UndoCheckout on files checked out by some other user.

=back

=head2 Files that aren't tagged

VSS has one peculiarity that this driver works around.

If a file does not contain the tag(s) used to select the source files,
C<vss log> outputs the entire life history of that file. We don't want
to capture the entire history of such files, so L<VCP::Source::vss>
ignores any revisions before and after the oldest and newest tagged file
in the range.

=head1 LIMITATIONS

Many and various.
VSS, aside from its "normal" level of database corruption that many
sites either deal with regularly or manage to ignore, also has many
reporting and, from what I can tell, data model flaws that make it
challenging to figure out what happened when.

=cut

$VERSION = 1.2 ;

# Removed docs for -f, since I now think it's overcomplicating things...
#Without a -f This will normally only replicate files which are tagged. This
#means that files that have been added since, or which are missing the tag for
#some reason, are ignored.
#
#Use the L</-f> option to force files that don't contain the tag to be
#=item -f
#
#This option causes vcp to attempt to export files that don't contain a
#particular tag but which occur in the date range spanned by the revisions
#specified with -r. The typical use is to get all files from a certain
#tag to now.
#
#It does this by exporting all revisions of files between the oldest and
#newest files that the -r specified. Without C<-f>, these would
#be ignored.
#
#It is an error to specify C<-f> without C<-r>.
#
#exported.

use strict ;

use Carp ;
use File::Basename;
use Regexp::Shellish qw( :all ) ;
use VCP::Rev ;
use VCP::Debug qw(:debug );
use VCP::Logger qw( lg pr BUG pr_doing pr_done );
use VCP::Source ;
use VCP::Utils qw( escape_filename empty start_dir_rel2abs );
use VCP::Utils::vss ;

use base qw( VCP::Source VCP::Utils::vss ) ;
use fields (
   'VSS_CUR',            ## The current change number being processed
   'VSS_IS_INCREMENTAL', ## Hash of filenames, 0->bootstrap, 1->incremental
   'VSS_INFO',           ## Results of the 'vss --version' command and VSSROOT
   'VSS_LABEL_CACHE',    ## ->{$name}->{$rev} is a list of labels for that rev
   'VSS_LABELS',         ## Array of labels from 'p4 labels'
   'VSS_MAX',            ## The last change number needed
   'VSS_MIN',            ## The first change number needed
#   'VSS_WORK_DIR',       ## working directory set via --cd option
   'VSS_UNDOCHECKOUT',   ## Whether or not to undocheckout when the
                         ## "File ...
                         ## is checked out" error occurs
   'VSS_VER_SPECS',      ## An ARRAY of revision specs to pass to
                         ## `ss History`. undef if there are none.

   'VSS_NAME_REP_NAME',  ## A mapping of names to repository names

   'VSS_NEEDS_BASE_REV', ## What base revisions are needed. Base revs are
                         ## needed for incremental (ie non-bootstrap) updates,
                         ## which is decided on a per-file basis by looking
                         ## at VCP::Source::is_bootstrap_mode( $file ) and
                         ## the file's rev number (ie does it end in .1).

   'VSS_HIGHEST_VERSION',  ## A HASH keyed on filename that contains the
                         ## last rev_id seen for a file. This allows
                         ## file deletions (which aren't tracked by
                         ## VSS in a file's history) to be given a
                         ## pretend revision number. This value includes
                         ## any VSS revisions we ignore because they
                         ## are merely label actions.

   'VSS_HIGHEST_VERSION_TO_SEND',
                         ## This is like VSS_HIGHEST_VERSION but
                         ## does *not* include the ignored VSS revisions.
                         ## So it will be smaller than VSS_HIGHEST_VERSION
                         ## whenever labels are involved.

   'VSS_REV_ID_OFFSET',  ## After a busy day processing a deleted file,
                         ## it's time to relax and process the not-deleted
                         ## file of the same name. In order to keep
                         ## from reusing the same version numbers for
                         ## the not-deleted file, this variable contains
                         ## an offset to add to the revisions. It's the
                         ## value of VSS_HIGHEST_VERSION reached while
                         ## reading the deleted file.

   'VSS_CURRENT_PROJECT', ## The last ss cp parameter we issued.

   'VSS_FILES',          ## We need to scan VSS for a list of files so we
                         ## can do wildcard processing. This is done with
                         ## a VCP::FilesDB object.

   'VSS_BRANCHED_FROM',  ## Cache of what files are branched from what
                         ## other files. Each HASH key is an absolute
                         ## VSS path to a file in lowercase.
                         ## Each element is a
                         ## RevML id (/path/to/file#5) of the parent
                         ## version.

   ## Log file parsing state.
   'VSS_LOG_FILE_DATA',  ## The data that applies to the file for which
                         ## the history log is being parsed.
   'VSS_LOG_REV_DATA',   ## Multiple VSS revisions can get compressed
                         ## in to a single VCP revision in order to
                         ## associate labels with the last actually
                         ## changed version. To do this, the parser
                         ## keeps accumulating data in this HASH
                         ## until it finds a revision with an action
                         ## other than "Labeled". The parser works
                         ## from most recent revision to oldest and,
                         ## may need to go past a revision specification
                         ## that was given on the command line. This
                         ## is a class data member so that repeated calls
                         ## to the history command may be made to find
                         ## a committable offense.
   'VSS_LOG_OLDEST_VERSION',  ## The oldest rev parsed for this file.
) ;


sub new {
   my $class = shift ;
   $class = ref $class || $class ;

   my VCP::Source::vss $self = $class->SUPER::new( @_ ) ;

   ## Parse the options
   my ( $spec, $options ) = @_ ;

   unless ( empty $spec ) {
      ## Ignore leading / and $/
      $spec =~ s{^\$?/*}{};
      ## Make it look like a Unix path.
      $spec =~ s{\\}{/}g;

      $self->parse_vss_repo_spec( $spec );
   }

   $self->parse_options( $options );

   return $self ;
}


sub options_spec {
   my VCP::Source::vss $self = shift;
   return (
      $self->SUPER::options_spec,
      "undocheckout" => \$self->{VSS_UNDOCHECKOUT},
#      "cd=s"         => \$self->{VSS_WORK_DIR},
#      "V=s"          => sub {
#         shift;
#         push @{$self->{VSS_VER_SPECS}}, "-V" . shift if @_;
#         return map substr( $_, 2 ), @{$self->{VSS_VER_SPECS}};
#      },
   );
}


sub init {
   my VCP::Source::vss $self= shift ;

   $self->SUPER::init;

   ## Set default repo_id.
   $self->repo_id( "vss:" . $self->repo_server )
      if empty $self->repo_id && ! empty $self->repo_server ;

   my $files = $self->repo_filespec ;
   $self->deduce_rev_root( $files )
      unless defined $self->rev_root;

   ## rev_root should be "$"-less.
   $self->rev_root( $1 ) if $self->rev_root =~ m{\A\$(.*)};

#   my $work_dir = $self->{VSS_WORK_DIR};
#   unless ( defined $work_dir ) {
      $self->create_vss_workspace ;
#   }
#   else {
#      $self->work_root( start_dir_rel2abs $work_dir ) ;
#      $self->command_chdir( $self->work_path ) ;
#   }

   {
      ## Dirty trick: send a known bad parm *just* to get ss.exe to
      ## print its banner without popping open a help screen.
      ## we capture and ignore stderr because it's expected.
      $self->ss( [ "help", "/illegal arg" ],
         undef,
         \my $out,
         \my $ignored_err,
         {
            ok_result_codes => [0..255],
         },
      );

      $self->{VSS_INFO} = $out;
   }

   $self->files->delete_db;
   $self->files->open_db;

   if ( $self->{VSS_UNDOCHECKOUT} ) {
      $self->command_stderr_filter( sub {
         my ( $err_text_ref ) = @_;
         if ( $$err_text_ref =~ s{^File .* is checked out by .*\r?\n}{} ) {
            $self->throw_undocheckout_and_retry;
         }
      } );
   }
}


=item files

Returns a reference to the FilesDB for this backend and repository.
Creates an empty one if need be.

This is like VCP::Dest::files(), but most other sources do not need to
do this.

=cut

sub files {
   my VCP::Source::vss $self = shift ;

   return $self->{VSS_FILES} ||= do {
      require VCP::FilesDB;
      $self->{VSS_FILES} = VCP::FilesDB->new(
         TableName => "source_files",
         StoreLoc  => $self->_db_store_location,
      );
   }
}


sub is_incremental {
   my VCP::Source::vss $self= shift ;
   my ( $file, $first_rev ) = @_ ;

   $first_rev =~ s/\.\d+//;  ## Trim down <delete /> rev_ids

   my $bootstrap_mode = $first_rev <= 1
      || $self->is_bootstrap_mode( $file ) ;

   return ! $bootstrap_mode ;
}


sub denormalize_name {
   my VCP::Source::vss $self = shift ;
   return '/' .
      $self->SUPER::denormalize_name( @_ ) ;
}


sub handle_header {
   my VCP::Source::vss $self = shift ;
   my ( $header ) = @_ ;

   $header->{rep_type} = 'vss' ;
   $header->{rep_desc} = $self->{VSS_INFO} ;
   $header->{rev_root} = $self->rev_root ;

   $self->dest->handle_header( $header ) ;
   return ;
}


sub ss_get {
   ## A specialized method for this so that we can snatch the file we get
   ## to its new filename before it gets redeleted when the Get is being
   ## performed by _swap_in_deleted_file_and(). Otherwise SS.EXE helpfully
   ## deletes it, whether or not -S- is passed.
   my VCP::Source::vss $self = shift ;
   my ( $r, $rev_id, $dir, $fn ) = @_;

   $self->ss(
      [
         "Get",
         "\$/" . $r->vcp_source_scm_fn,
         "-V" . $rev_id,
         "-GN",          ## Newlines only, please
         "-GL" . $dir,
      ],
   );

   my $temp_fn = "$dir/" . fileparse( $r->vcp_source_scm_fn );
   die ## Should be a BUG, but we don't want to abort the Recover.
      "$temp_fn (", $r->vcp_source_scm_fn, ") does not exist after Get"
      unless -e $temp_fn;

   rename "$temp_fn", "$dir/$fn"
      or die "$! renaming $temp_fn to $dir/$fn\n";
}


sub get_source_file {
   my VCP::Source::vss $self = shift ;

   my VCP::Rev $r ;
   ( $r ) = @_ ;

   debug "getting ", $r->as_string if debugging;

   die "can't check out ", $r->as_string, "\n"
      unless $r->is_base_rev || $r->action eq "add" || $r->action eq "edit";

   my $wp = $self->work_path( "revs", $r->source_name, $r->source_rev_id ) ;
   $r->work_path( $wp ) ;
   $self->mkpdir( $wp ) ;

   my ( $fn, $dir ) = fileparse( $wp );

   confess "Shouldn't be get_rev()ing a rev with no rev_id"
      unless defined $r->rev_id;

   $dir =~ s{[\\/]\z}{};
      ## Trailing slashes confuse SS.EXE's dequoting by making it think
      ## the trailing slash is an escape character so it will take its
      ## trailing quote literally.
   my $do_normal_get = 1;
   my $vcp_rev_id = $r->rev_id;
   if ( $self->vss_file_is_deleted( $r->vcp_source_scm_fn ) ) {
      my $offset = $self->{VSS_REV_ID_OFFSET}->{$r->vcp_source_scm_fn};
      if ( VCP::Rev->cmp_id( $vcp_rev_id, $offset ) <= 0 ) {
         $self->_swap_in_deleted_file_and(
            $r->vcp_source_scm_fn, "ss_get", ( $r, $vcp_rev_id, $dir, $fn )
         ) ;
         $do_normal_get = 0;
      }
      else {
         $vcp_rev_id = int 0.5 + $vcp_rev_id - $offset;
      }
   }

   $self->ss_get( $r, $vcp_rev_id, $dir, $fn ) if $do_normal_get;

   return $wp;
}

## History report Parser states
## The code below does things like grep for "commit" and "skip to next"
## in these strings. Plus, they make debug output easier to read.
use constant SKIP_TO_NEXT => "skip to next";
use constant SKIP_TO_NEXT_COMMIT_AT_END => "skip to next and commit at end";
use constant ENTRY_START => "entry start";
use constant READ_ACTION => "read action";
use constant READ_COMMENT_AND_COMMIT => "read comment and commit";
use constant READ_REST_OF_COMMENT_AND_COMMIT => "read rest of comment and commit";


sub _get_file_metadata {
   my VCP::Source::vss $self = shift ;
   my ( $filename ) = @_;

   my $ss_fn = "\$/$filename";

   my $properties;
   $self->ss( [ "Properties", $ss_fn ], undef, \$properties );
   debug "[$properties]" if debugging;

   my ( $filetype ) = $properties =~ /^Type:\s+(\S+)$/m
      or BUG "Can't parse filetype from '$properties'";
   $filetype = lc $filetype;

   my $store_head_only = $properties =~ /^Store only latest version:\s+Y/mi;

   my $tmp_f;
   my $result = 1;

   ## Clear the parser state.
   $self->{VSS_LOG_OLDEST_VERSION} = undef ;
   $self->{VSS_LOG_REV_DATA} = undef;
   $self->{VSS_LOG_FILE_DATA} = {
      Name     => $filename,
      Type     => $filetype,
      HeadOnly => $store_head_only,
   };

   ## The interplay between VSS_VER_SPECS and HeadOnly means that
   ## we can't be sure that the most recent reported version
   ## is actually stored in the repository (it might be version 5 of 10,
   ## say). So we can't pass -#1 to ss History, we have to tell the parser
   ## to bail after the first version it reads.
   $self->ss(
      [ "History", "\$/$filename", @{$self->{VSS_VER_SPECS} || []}, ],
      undef,
      sub { $self->parse_history_output( @_, $store_head_only ) },
      $self->{VSS_VER_SPECS}
         ? (
            stderr_filter => sub {
               my ( $err_text_ref ) = @_ ;
               $$err_text_ref =~ s{^Version not found\r?\n\r?}[$result = 0; '' ;]mei ;
            },
         )
         : ()
   );

   ## If the history ended on a "Labeled" rev, it will not have
   ## been saved off as a real rev yet.
   ## I think this should only happen if the -V
   ## option was used.
   $self->_add_rev_from_log_parser if $self->{VSS_LOG_REV_DATA};

   ## If the oldest revision not found was not a branch founding
   ## revision, then VSS_LOG_OLDEST_VERSION will be set.
   my $oldest = $self->{VSS_LOG_OLDEST_VERSION};
   if ( defined $oldest
      && $self->is_incremental( $filename, $oldest )
      && ! $store_head_only
   ) {
      debug "scanning back to base rev" if debugging;
      $oldest =~ s/\.\d+//; # ignore faked-up revs.

      ## Walk back and find the next real version (ie not a labelled
      ## version). This should exist in the destination repository,
      ## even if it's not the head revision.
      while ( --$oldest && $oldest ) {
         if ( $oldest <= $self->{VSS_REV_ID_OFFSET}->{$filename} ) {
            $self->_swap_in_deleted_file_and(
               $filename, "_parse_a_rev", ( $filename, $oldest )
            );
         }
         else {
            $self->_parse_a_rev( $filename, $oldest );
         }
         if ( !$self->{VSS_LOG_REV_DATA} ) {
            ## Must have found a real edit.
            debug "converting to base_rev", $self->last_rev->as_string
               if debugging;
            $self->last_rev->base_revify;
            last;
         }
      }
   }

   if ( keys %{$self->{VSS_LOG_REV_DATA}} ) {
      require Data::Dumper;
      local $Data::Dumper::Indent = 1;
      local $Data::Dumper::Quotekeys = 0;
      local $Data::Dumper::Terse = 1;
      BUG(
         "Data left over from log parse\n",
         Data::Dumper::Dumper( $self->{VSS_LOG_REV_DATA} )
      );
   }

   return $result;
}

{
   ## This routine is used once per operation so that the source file is
   ## deleted immediately after each operation so that the source repo
   ## is always put back in its proper state in case we exit between
   ## operations.
   ## This is inefficient, but conservative.
   ## TODO: Allow a fast-but-dangerous option to make this maintain state
   ## for each file and only clean up the repository at the end.

   my $pending_swap_out;
      ## pending_swap_out is set so that END{} can clean up...

   END {
      $pending_swap_out->() if ! empty $pending_swap_out;
   }

   sub _swap_in_deleted_file_and {
      my VCP::Source::vss $self = shift ;
      my ( $filename, $method, @args ) = @_;

      my $ss_fn = "\$/$filename";

      my $renamed_active;

      if ( $self->vss_file_is_active( $filename ) ) {
         my $i = "";
         while (1) {
            $renamed_active = "$ss_fn.vcp_bak$i";
            ( my $key = $renamed_active ) =~ s/^\$?[\\\/]//g;
            last unless ($self->files->get( [ $key ] ))[0];
            $i ||= 0;
            ++$i;
         }

         $self->ss( [ "Rename", $ss_fn, $renamed_active ] );
      }

      my $result;

      $self->ss( [ "Recover", $ss_fn ], );

      $pending_swap_out = sub {
         $pending_swap_out = undef;
         my $ok = eval {
            $self->ss( [ "Delete", $ss_fn ] );
            1;
         };
         my $x = $ok ? "" : $@;
         if ( ! empty $renamed_active ) {
            if ( ! eval {
               $self->ss( [ "Rename", $renamed_active, $ss_fn ] );
               1
            } ) {
               $@ = "$x$@";
               return 0;
            };
         }
         return $ok;
      };

      my $ok = eval { $result = $self->$method( @args ); 1 };
      my $x = $ok ? "" : $@;
      $ok = $pending_swap_out->() && $ok;
      die $x.$@ unless $ok;

      return $result;
   }
}


sub copy_revs {
   my VCP::Source::vss $self = shift ;

   ## Get a list of all files we need to worry about
   $self->get_vss_file_list( $self->repo_filespec );

   pr_doing "extracting VSS metadata: ", { Expect => 0+$self->vss_files };

   for my $filename ( $self->vss_files ) {
      pr_doing;
      $self->{VSS_REV_ID_OFFSET}->{$filename} = 0;

      my $found_deleted;
      if ( $self->vss_file_is_deleted( $filename ) ) {
         ## Create deletion revisions for deleted files.
         $found_deleted = $self->_swap_in_deleted_file_and(
            $filename, "_get_file_metadata", $filename
         );

         $self->{VSS_REV_ID_OFFSET}->{$filename} =
            $self->{VSS_HIGHEST_VERSION}->{$filename};

         my $vss_name = "/$filename";
         my $norm_name = $self->normalize_name( $filename );
         my $rev_id = "$self->{VSS_REV_ID_OFFSET}->{$filename}.1";
         my $branch_id = (fileparse $vss_name )[1];

         my VCP::Rev $r = VCP::Rev->new(
            id                   => "$vss_name#$rev_id",
            name                 => $norm_name,
            vcp_source_scm_fn    => $filename,
            source_name          => $norm_name,
            source_filebranch_id => $vss_name,
            branch_id            => $branch_id,
            source_branch_id     => $branch_id,
            source_repo_id       => $self->repo_id,
            action               => "delete",
            ## Make up a fictional rev number that will allow the
            ## receiver's sort algorithm to put this delete in the
            ## right place and that will be documented in the
            ## receiving repository as a label.
            rev_id               => $rev_id,
            source_rev_id        => $rev_id,
            ## Deletes are not logged, no user data, time, etc.
            !empty( $self->{VSS_HIGHEST_VERSION_TO_SEND}->{$filename} )
               ? ( previous_id => "$vss_name#$self->{VSS_HIGHEST_VERSION_TO_SEND}->{$filename}" )
               : (),
         ) ;

         my $add_it = 1;

         if ( $self->continue && $self->dest ) {
            my $previous_rev_id = $self->dest->last_rev_in_filebranch(
               $self->repo_id,
               $vss_name,
            );

            my $cmp = defined $previous_rev_id
               ? VCP::Rev->cmp_id( $previous_rev_id, $rev_id )
               : -1;

            $add_it = $cmp < 0;
         }

         $self->send_rev( $r ) if $add_it;
      }

      my $found_active;
      if ( $self->vss_file_is_active( $filename ) ) {
         my $tmp_ver_spec;
         my $last_deleted;
         if ( $found_deleted ) {
            $last_deleted = $self->last_sent;
            ## If we were looking for a specific version and found it
            ## back in the deleted time, make sure we also get all
            ## the revs from the active file.
            ## THIS ASSUMES WE'RE NOT SEARCHING FOR A RANGE.
            ## Can't local()ize a p-hash.
            $tmp_ver_spec = $self->{VSS_VER_SPECS};
            $self->{VSS_VER_SPECS} = undef;
         }

         $found_active = $self->_get_file_metadata( $filename );

         $self->{VSS_VER_SPECS} = $tmp_ver_spec if $found_deleted;

         if ( $last_deleted && $found_active ) {
            my $r = $self->last_sent;
            $r->previous_id( $last_deleted->id )
               unless $r == $last_deleted;
         }
      }

      pr join " ",
         @{$self->{VSS_VER_SPECS}},
         "did not match any revisions of $filename"
         if $self->{VSS_VER_SPECS} && ! ( $found_deleted || $found_active );
   }

   ## Placeholders should not have types (the destinations
   ## carry the parent's type over), but placeholders with no
   ## parents need to be converted in to edit operations because
   ## they are founding branches.
   ## TODO: should placeholders with no parents be "add" instead of "edit"?
   for my $r ( $self->queued_revs ) {
      ## We assume that any unfound source branches are not wanted and
      ## that the user intends to export a branch without its roots.
      if ( $self->id_seen( $r->previous_id ) ) {
         $r->type( undef ) if $r->action eq "placeholder";
      }
      else {
         $r->previous_id( undef );
         $r->action( 'edit' ) if $r->action eq "placeholder";
      }
   }

   $self->send_revs;

   pr_done;

   pr "found ", $self->sent_rev_count, " revisions";
}


# Here's a typical history
#
###############################################################################
##D:\src\vcp>ss history
#History of $/90vss.t ...
#
#***************** Version 9 *****************
#User: Admin Date: 3/05/02 Time: 9:32
#readd recovered
#
#***** a_big_file *****
#Version 3
#User: Admin Date: 3/05/02 Time: 9:32
#Checked in $/90vss.t
#Comment: comment 3
#
#
#***** binary *****
#Version 3
#User: Admin Date: 3/05/02 Time: 9:32
#Checked in $/90vss.t
#Comment: comment 3
#
#
#***************** Version 8 *****************
#User: Admin Date: 3/05/02 Time: 9:32
#readd deleted
#
#***** binary *****
#Version 2
#User: Admin Date: 3/05/02 Time: 9:32
#Checked in $/90vss.t
#Comment: comment 2
#
#
#***************** Version 7 *****************
#User: Admin Date: 3/05/02 Time: 9:32
#readd added
#
#***** a_big_file *****
#Version 2
#User: Admin Date: 3/05/02 Time: 9:32
#Checked in $/90vss.t
#Comment: comment 2
#
#
#***************** Version 6 *****************
#User: Admin Date: 3/05/02 Time: 9:32
#$del added
#
#***************** Version 5 *****************
#User: Admin Date: 3/05/02 Time: 9:32
#binary added
#
#***************** Version 4 *****************
#User: Admin Date: 3/05/02 Time: 9:31
#$add added
#
#***************** Version 3 *****************
#User: Admin Date: 3/05/02 Time: 9:31
#a_big_file added
#
#***************** Version 2 *****************
#User: Admin Date: 3/05/02 Time: 9:31
#$a added
#
#***************** Version 1 *****************
#User: Admin Date: 3/05/02 Time: 9:31
#Created
#
#
#D:\src\vcp>ss dir /r
#$/90vss.t:
#$a
#$add
#$del
#a_big_file
#binary
#readd
#
#$/90vss.t/a:
#$deeply
#
#$/90vss.t/a/deeply:
#$buried
#
#$/90vss.t/a/deeply/buried:
#file
#
#$/90vss.t/add:
#f1
#f2
#f3
#
#$/90vss.t/del:
#f4
#
#13 item(s)
#
#D:\src\vcp>
#
###############################################################################

sub _parse_a_rev {
   my ( $self, $fn, $rev_id ) = @_;

   $rev_id -= $self->{VSS_REV_ID_OFFSET}->{$fn}
      if $rev_id > $self->{VSS_REV_ID_OFFSET}->{$fn};

   $self->ss(
      [ "History", "\$/$fn", "-V$rev_id", "-#1" ],
      undef,
      sub { $self->parse_history_output( @_ ) }
   );

   ## If the history ended on a "Labeled"
   ## rev, it will not have
   ## been saved off as a real rev yet.
   ## I think this should only happen if the -V
   ## option was used.
   $self->_add_rev_from_log_parser if $self->{VSS_LOG_REV_DATA};
}

## Called each time a new revision is reached and there's no place to
## catch the information.
sub _init_log_rev_data {
   my VCP::Source::vss $self = shift;

   debug "initializing new rev" if debugging;
   return $self->{VSS_LOG_REV_DATA} = {
      %{$self->{VSS_LOG_FILE_DATA}},
   };
}


sub _add_rev_from_log_parser {
   my ( $self ) = @_;

   debug "adding revision" if debugging;
   my $p = $self->{VSS_LOG_REV_DATA};
   BUG "trying to add a revision when none was parsed" unless $p;
   $self->{VSS_LOG_REV_DATA} = undef;

   $p->{Comment} = '' unless defined $p->{Comment};
   $p->{Comment} =~ s/\r\n|\n\r/\n/g ;
   chomp $p->{Comment};
   chomp $p->{Comment};

   my $added_it = $self->_add_rev( $p );

   my $name = $p->{Name};

   ## This is the version number without the additional label
   ## versions.
   my $v = $p->{Version};

   $self->{VSS_HIGHEST_VERSION_TO_SEND}->{$name} = $v
      if $added_it
         && ( ! defined $self->{VSS_HIGHEST_VERSION_TO_SEND}->{$name}
            || $v > $self->{VSS_HIGHEST_VERSION_TO_SEND}->{$name}
         );
   ## VSS_HIGHEST_VERSION_TO_SEND is used to generate the previous_id
   ## for revisions. If we don't end up queuing a revision, $added_it
   ## will be false. In this case, don't set VSS_HIGHEST_VERSION_TO_SEND
   ## because we don't want to refer to unsent revisions.

   $v += @{ $p->{Labels} || [] };

   $self->{VSS_HIGHEST_VERSION}->{$name} = $v
      if !
         defined $self->{VSS_HIGHEST_VERSION}->{$name}
         || $v > $self->{VSS_HIGHEST_VERSION}->{$name};
}


sub parse_history_output {
   my VCP::Source::vss $self = shift;
   my ( $input, $exit_after_head_rev ) = @_ ;

   my $state = SKIP_TO_NEXT;
   my $p = $self->{VSS_LOG_REV_DATA};

   debug "\$exit_after_head_rev set" if debugging && $exit_after_head_rev;

   local $_ ;

   while ( <$input> ) {
      if ( debugging ) {
         my $foo = $_;
         chomp $foo;
         debug "[$foo] $state\n";
      }

      if ( /^\*{17} Version (\d+) +\*{17}/ ) {
         if ( $p && "commit" eq substr $state, -6 ) {
            $self->_add_rev_from_log_parser;
            return if $exit_after_head_rev;
         }

         $state = ENTRY_START;
         $p = $self->_init_log_rev_data
            unless $self->{VSS_LOG_REV_DATA};
         ## This will overwrite the newer/higher version number
         ## with the lower/older one until we reach the check-in
         ## we want
         $self->{VSS_LOG_OLDEST_VERSION} = $p->{Version} = $1;
         next;
      }

      if ( /^\*{5}\s+(.*?)\s+\*{5}$/ ) {
         if ( $p && "commit" eq substr $state, -6 ) {
            $self->_add_rev_from_log_parser;
            return if $exit_after_head_rev;
         }
         $state = ENTRY_START;
         $p = $self->_init_log_rev_data
            unless $self->{VSS_LOG_REV_DATA};
         next;
      }

      next if 0 == index $state, SKIP_TO_NEXT;

      if ( $state eq ENTRY_START ) {
         if ( /^User:\s+(.*?)\s+Date:\s+(.*?)\s+Time:\s+(\S+)/ ) {
            ## Store these aside in case they're for the next VCP::Rev
            ## (which we can only tell when reading the action).
            $p->{User}= $1;
            $p->{Date}= $2;
            $p->{Time}= $3;
            $state = READ_ACTION;
            next;
         }

         if ( /^Label:\s*"([^"]+)"/ ) {
            ## Unshift because we're reading from newest to oldest yet
            ## we want oldest first so vss->vss is relatively consistent
            unshift @{$p->{Labels}}, $1;
            next;
         }
      }

      if ( $state eq READ_ACTION ) {
         if ( /Labeled/ ) {
            ## It's a label-add only, ignore the rest.
            ## for incremental exports, we'll need to commit at the
            ## end of the log if the last thing was a "Labeled"
            ## version. We don't want to commit after each "Labeled"
            ## because we want to aggregate labels.
            $state = SKIP_TO_NEXT_COMMIT_AT_END;
            next;
         }

         if ( /Rolled back/ ) {
            ## This could be any number of things:
            ##   * Rollback
            ##   * Rollback-before-Branch
            ##   * Share -V
            ##   * Share -V followed by Branch
            ##   * Other things I don't understand
            ## We should figure out which one, but I'm not sure
            ## how to differentiate these. For now, I'm assuming
            ## that it's a branch creation.
            my $previous_id = eval { $self->branched_from( $p->{Name} ) };

            if ( $previous_id ) {
               ## Guess that it's a branch operation that VSS is hiding
               ## from us. Hope the user didn't *really* issue a
               ## Rollback.
               pr "assuming Rollback on branch is Branch point\n",
                  " Parent: \$$previous_id\n",
                  " Child: \$/$p->{Name}#$p->{Version}";
               $p->{PreviousId} = $previous_id;
               goto BranchFound;
            }

            $state = SKIP_TO_NEXT_COMMIT_AT_END;
            next;
         }

         if ( /Branched/ ) {
            $state = SKIP_TO_NEXT_COMMIT_AT_END;
            $p->{PreviousId} = $self->branched_from( $p->{Name} );
         BranchFound:
            $p->{Action} = "placeholder";
            ## copy_revs might convert this back from a placeholder to an
            ## edit if the source of the branch is not available.

            ## Prevent the caller from searching back for a base
            ## revision.
            ## TODO: Allow a project with branched files to be extracted
            ## with the branch point being bootstrapped.
            $self->{VSS_LOG_OLDEST_VERSION} = undef;

            ## Ignore all history before the branch, it's just
            ## bleedthrough from the parent.
            ## TODO: deal properly with shared history before a branch.
            ## This may require noting the branch point and scrolling
            ## back to the beginning creating placeholders over and
            ## over again as we do with dual-labelled CVS file branches.
            return;
         }

         if ( /^(Checked in .*|Created|.* recovered)\r?\n/ ) {
            $state = READ_COMMENT_AND_COMMIT;
            $p->{Action} = "edit";
            next;
         }
      }

      if ( $state eq READ_COMMENT_AND_COMMIT ) {
         if ( s/Comment: // ) {
            $p->{Comment} = $_;
            $state = READ_REST_OF_COMMENT_AND_COMMIT;
            next;
         }
         next unless /\S/;
      }

      if ( $state eq READ_REST_OF_COMMENT_AND_COMMIT ) {
         $p->{Comment} .= $_;
         next;
      }

      require Data::Dumper;
      local $Data::Dumper::Indent = 1;
      local $Data::Dumper::Quotekeys = 0;
      local $Data::Dumper::Terse = 1;

      BUG "unhandled VSS log line '$_' in state '$state' for:\n",
         Data::Dumper::Dumper( \%$p );
   }

   if ( 0 <= index $state, "commit" ) {
      $self->_add_rev_from_log_parser;
      return if $exit_after_head_rev;
   }
}


# Here's a (probably out-of-date by the time you read this) dump of the args
# for _add_rev:
#
###############################################################################
#$file = {
# 'WORKING' => 'src/Eesh/eg/synopsis',
# 'SELECTED' => '2',
# 'LOCKS' => 'strict',
# 'TOTAL' => '2',
# 'ACCESS' => '',
# 'RCS' => '/var/vss/vssroot/src/Eesh/eg/synopsis,v',
# 'KEYWORD' => 'kv',
# 'RTAGS' => {
# '1.1' => [
# 'Eesh_003_000',
# 'Eesh_002_000'
# ]
# },
# 'HEAD' => '1.2',
# 'TAGS' => {
# 'Eesh_002_000' => '1.1',
# 'Eesh_003_000' => '1.1'
# },
# 'BRANCH' => ''
#};
#$rev = {
# 'DATE' => '2000/04/21 17:32:16',
# 'MESSAGE' => 'Moved a bunch of code from eesh, then deleted most of it.
#',
# 'STATE' => 'Exp',
# 'AUTHOR' => 'barries',
# 'REV' => '1.1'
#};
###############################################################################

## Each rev needs to be dealt with in one of three ways: send it as-is
## (which is always the case in non-incremental mode), discard it, or
## send it as a base revision. This function decides which to do.
## Returns "base rev" if it needs to be sent as a
## base revision, some other TRUE value if it needs to be sent as-is, or
## a FALSE value if it should be ignored.
##
## --continue mode needs to discard revisions we've sent before except
## when sending the last revision that we've sent before. Those need
## to be sent as base revisions.
##
sub _filter_rev {
   my VCP::Source::vss $self = shift ;
   my ( $vss_name, $filename, $rev_id, $action ) = @_;

   BUG "No destination set" unless $self->dest;

   return "send it" unless $self->continue;

   my $previous_rev_id = $self->dest->last_rev_in_filebranch(
      $self->repo_id,
      $vss_name,
   );

   my $cmp = defined $previous_rev_id
      ? VCP::Rev->cmp_id( $rev_id, $previous_rev_id )
      : 1;

   return 1 if $cmp > 0;      ## Send it: never sent it before
   return undef if $cmp < 0;  ## Discard it: it's been sent before

   ## We may need to send it as a base rev: it was already sent over
   ## once, so we won't send it whole. If we're bootstrapping this
   ## file, we don't send a bootstrap.
   return undef if $self->is_bootstrap_mode( $filename );

   ## If this is a placeholder revision, don't resend it no matter what.
   ## The branch has already been created at the destination if need be
   ## TODO: perhaps the placeholder's predecessor should be sent as
   ## a base rev?
   return undef if $action eq "placeholder";

   return "base rev";
}


sub _add_rev {
   my VCP::Source::vss $self = shift ;
   my ( $rev_data, $is_base_rev ) = @_ ;

   my $filename = $rev_data->{Name};
   my $vss_name = "/$filename";

   my $rev_id = $rev_data->{Version} + $self->{VSS_REV_ID_OFFSET}->{$filename};
   my $action = $rev_data->{Action};

   my $send_mode = $self->_filter_rev( $vss_name, $filename, $rev_id, $action );
   return unless $send_mode;

   my $norm_name = $self->normalize_name( $filename );
   my $branch_id = (fileparse $vss_name )[1];

   $rev_data->{Type} ||= "text";

   my VCP::Rev $r = VCP::Rev->new(
      id                   => "$vss_name#$rev_id",
      vcp_source_scm_fn    => $filename,
      name                 => $norm_name,
      source_name          => $norm_name,
      source_filebranch_id => $vss_name,
      branch_id            => $branch_id,
      source_branch_id     => $branch_id,
      source_repo_id       => $self->repo_id,
      rev_id               => $rev_id,
      source_rev_id        => $rev_id,
      previous_id          => $rev_data->{PreviousId},
      type                 => $rev_data->{Type},
      $send_mode ne "base rev"
         ? (
            action  => $action,
            time    => $self->parse_time( $rev_data->{Date} . " " .
               $rev_data->{Time} ),
            user_id => $rev_data->{User},
            comment => $rev_data->{Comment},
            state   => $rev_data->{STATE},
            labels  => $rev_data->{Labels},
         )
         : (),
   );

   $self->{VSS_NAME_REP_NAME}->{$rev_data->{Name}} = $rev_data->{RCS} ;

   $self->set_last_rev_in_filebranch_previous_id( $r );

   if ( $r->is_placeholder_rev ) {
      $self->queue_rev( $r );
   }
   elsif ( $self->id_seen( $r->id ) ) {
      pr "can't send rev a second time (corrupt VSS repository?): ",
         $r->as_string;
   }
   else {
      $self->send_rev( $r ) ;
   }

   return 1;
}


sub branched_from {
   my VCP::Source::vss $self = shift ;
   my ( $filename ) = @_;

   my $fn = "\$/$filename";
   $fn = lc $fn unless $self->case_sensitive;

   $self->ss( [ "Paths", $filename ],
      undef,
      sub { $self->parse_paths_output( @_ ) },
   ) unless exists $self->{VSS_BRANCHED_FROM}->{$fn};

#   BUG "can't find parent for '$filename'"
   return undef
      unless exists $self->{VSS_BRANCHED_FROM}->{$fn};

   return $self->{VSS_BRANCHED_FROM}->{$fn};
}

## Output looks like:
##
## Showing development paths for $/revml2vss/main-branch-1/branched...
##
## bar
## $/revml2vss/main
## bar (Branched at version 4)
## $/foo
##
## branched (Branched at version 2)
## > $/revml2vss/main-branch-1
##
## We ignore the ">" position indicator.
##
##

sub parse_paths_output {
   my VCP::Source::vss $self = shift ;
   my ( $input ) = @_ ;

   my $l = <$input>;
   BUG "expected 'Showing development...'
from Paths, not '$l'"
      unless $l =~ /^Showing development/;

   $l = <$input>;
   BUG "expected Paths output line 2 to be blank, not '$l'"
      unless $l =~ /^\r?\n/;

   my $last_indent_length = 0;
   my $parent_full_fn;
   my $cur_fn;
   my $cur_branched_at;
   my $first_full_fn;

   local $_ ;

   while ( <$input> ) {
      if ( debugging ) {
         my $foo = $_;
         chomp $foo;
         debug "[$foo]\n";
      }

      next if /\A\s*\z/;

      my ( $indent, $content ) = /^(>?\s+)(\S.*?)\r?\n/
         or BUG "in Path output, can't parse line '$_'";

      my $cur_indent = length $indent;

      BUG "in Path output, unexpected outdent from $last_indent_length to $cur_indent in '$_'"
         if $cur_indent < $last_indent_length;

      my $is_project = '$/' eq substr $content, 0, 2;

      if ( $cur_indent > $last_indent_length ) {
         $last_indent_length = $cur_indent;
         $parent_full_fn = $first_full_fn;
         $first_full_fn = undef;
         BUG "in Path output, expected filename, not project path '$content'"
            if $is_project;
      }

      if ( $is_project ) {
         ## It's a line showing a project the cur_fn is shared by. Often
         ## (as in the above example) a file is in only one project
         ## but a file may be linked in to two projects.
         $content =~ s/\r?\n\z//;
         my $cur_full_fn = "$content/$cur_fn";
         $first_full_fn = $cur_full_fn unless defined $first_full_fn;

         ## The key is in VSS-ese, starts with '$'. The value is
         ## in RevML-ese, starts with '/'.
         if ( defined $cur_branched_at ) {
            my $key = $cur_full_fn;
            $key = lc $key unless $self->case_sensitive;
            if ( empty $parent_full_fn ) {
               ## This *seems* to mean that the version wasn't truly
               ## branched, perhaps because a Rollback undid a branch
               ## or a delete or something else; I have no idea.
               next;
            }
            $self->{VSS_BRANCHED_FROM}->{$key} =
               substr "$parent_full_fn#$cur_branched_at", 1;
            debug $cur_full_fn,
               " branched from ",
               $self->{VSS_BRANCHED_FROM}->{$key}
               if debugging;
         }
      }
      else {
         ## Must be another file branched from the same parent.
         ( $cur_fn, $cur_branched_at ) =
            $content =~ /\A(.*?\S)(?:\s+\(Branched at version (\d+)\))?\r?\z/
            or BUG "in Path output, unable to parse chunk '$content'";

         ## The "Branched at version" value is the version number in
         ## the child file that the branch was created at.  The parent
         ## carries the preceding version number (we hope).
         $cur_branched_at-- if defined $cur_branched_at;
      }
   }
}

=head1 VSS NOTES

We lose comments attached to labels: labels are added to the last "real"
(i.e. non-label-only) revision and the comments are ignored.  This can be
changed; contact me.

We assume a file has always been either text or binary; we don't think
this is stored per-version in VSS.

VSS does not track renames by version, so a previous name for a file is
lost.

VSS lets you add a new file after deleting an old one.  This module
renames the current file, restores the old one, issues its revisions,
then deletes the old one and renames the current file back.  In this
case, the C<rev_id>s from the current file start at the highest
C<rev_id> for the deleted file and continue up.

This can cause problems if somebody has the file checked out.  Use the
--undocheckout option to force VCP to undo the checkout and carry on.

Looks for deleted files: recovers them if found just long enough to cope
with them, then deletes them again.  Repeatedly, if need be.

NOTE: when recovering a deleted file and using it, the current version
takes a "create the smallest window of opportunity to leave the source
repository in an uncertain state" approach: it renames the not-deleted
version (if any), restores the deleted one, does the History or Get, and
then deletes it and renames the not-deleted version back.  This is so
that if something (the OS, the hardware, AC mains, or even VCP code)
crashes, the source repository is left as close to the original state as
is possible.
This does mean that this module can issue many more commands than
minimally necessary; perhaps there should be a --speed-over-safety
option or a transaction log & recovery system.

No incremental export is supported.  VSS' -V~Lfoo option, which says
"all versions since this label", does not actually cause the
C<ss.exe History> command to emit the indicated checkin.  We'll need to
make the history command much smarter to implement that.

Haven't tested many real-world scenarios yet.

If you specify a filespec that matches files branched from files not
included in the filespec, VCP pretends that the first revision of the
file at the new location is the first revision ever.

SS.EXE, which VCP uses for all SourceSafe operations, may ignore its -I-
option, which should prevent it from seeking input, and seek input
anyway.  This can hang VCP, usually when ^C is hit, and can leave SS.EXE
running at 100% CPU while it waits for a password.  Use the Task Manager
to clean up such processes.

=over

=item *

Share-ing a project

=back

=cut

=head1 SEE ALSO

L<VCP::Dest::vss>, L<vcp>.

=head1 AUTHOR

Barrie Slaymaker <barries@slaysys.com>

=head1 COPYRIGHT

Copyright (c) 2000, 2001, 2002 Perforce Software, Inc.
All rights reserved.

See L<VCP::License|VCP::License> (C<vcp help license>) for the terms of use.

=cut

1
# | Change | User | Description | Committed | |
---|---|---|---|---|---|
#58 | 4509 | Barrie Slaymaker | - VCP::Source::vss sends "digest" revisions - 95vss2p4.t handles RevML <action> tag | ||
#57 | 4507 | Barrie Slaymaker | - RevML: - added <action>, removed <delete>, <placeholder> and <move> - added <from_id> for clones (and eventually merge actions) - Simplified DTD (can't branch DTD based on which action any more) - VCP::Source::cvs, VCP::Filter::changesets and VCP::Dest::p4 support from_id in <action>clone</action> records - VCP::Dest::perl_data added - VCP::Rev::action() "branch" added, no more undefined action strings - "placeholder" action removed | ||
#56 | 4487 | Barrie Slaymaker | - dead code removal (thanks to clkao's coverage report) | ||
#55 | 4066 | Barrie Slaymaker | - unknown_VSS_user no longer set, it's up to the destination now | ||
#54 | 4039 | Barrie Slaymaker | - VCP::Source::scan_metadata() API now in place, - VCP::Source::copy_revs() is fully deprecated. | ||
#53 | 4032 | Barrie Slaymaker | - VCP::Dest::p4 now estimates missing metadata | ||
#52 | 4021 | Barrie Slaymaker | - Remove all phashes and all base & fields pragmas - Work around SWASHGET error | ||
#51 | 4012 | Barrie Slaymaker | - Remove dependance on pseudohashes (deprecated Perl feature) | ||
#50 | 3991 | Barrie Slaymaker | - VCP::Source::vss uses less RAM on repos with large file counts | ||
#49 | 3970 | Barrie Slaymaker | - VCP::Source handles rev queing, uses disk to reduce RAM - Lots of other fixes | ||
#48 | 3946 | Barrie Slaymaker | - VCP::Source::vss now parses history records that do not cause files to have new revisions, such as project labels. | ||
#47 | 3930 | Barrie Slaymaker | - VCP::Source::cvs and VCP::Dest::p4 handle cloning deletes - "placeholder" actions and is_placeholder_rev() deprecated in favor of is_branch_rev() and is_clone_rev(). - Misc cleanups and minor bugfixes | ||
#46 | 3921 | Barrie Slaymaker | - VCP::Source::vss uses "0." and "1." prefixes on all rev_ids to properly handle VSS's idea of deleted files. - VCP::Dest::vss now offers --dont-recover-deleted-files to allow VSS-like sources to be trasnferred more completely | ||
#45 | 3896 | Barrie Slaymaker | - VCP::Source::vss is no longer led astray by VSS' ad hoc (foo is deleted in thes project) annotations in the Paths report | ||
#44 | 3892 | Barrie Slaymaker | - VCP::Source::vss sets a user name of "unknown_VSS_user" for deletions. | ||
#43 | 3855 | Barrie Slaymaker | - vcp scan, filter, transfer basically functional - Need more work in re: storage format, etc, but functional | ||
#42 | 3850 | Barrie Slaymaker | - No longer stores all revs in memory | ||
#41 | 3836 | Barrie Slaymaker | - Sources no longer cache all revs in RAM before sending | ||
#40 | 3819 | Barrie Slaymaker | - Factor send & queueing of revs up in to VCP::Source | ||
#39 | 3818 | Barrie Slaymaker | - VCP::Source::{cvs,p4,vsS} use less memory | ||
#38 | 3813 | Barrie Slaymaker | - VCP::Rev::previous() is no more | ||
#37 | 3811 | Barrie Slaymaker | - fetch_*() and get_rev() renamed get_source_file() | ||
#36 | 3742 | Barrie Slaymaker | - VCP::Source::vss correctly links the first rev of a file to the the delete action its deleted predecessor. | ||
#35 | 3740 | Barrie Slaymaker | - VCP::Source::vss now ignores leading "$" in rev_root | ||
#34 | 3739 | Barrie Slaymaker | - VCP undo of Rename/Restore now occurs even when a BUG surfaces | ||
#33 | 3705 | Barrie Slaymaker | - VCP::Source::vss can parse all of real_vss_1 - VCP::Source::vss --undocheckout option added | ||
#32 | 3691 | Barrie Slaymaker | - t/91vss2revml.t now passes | ||
#31 | 3682 | Barrie Slaymaker | - VCP::Source::vss now survives more VCC oddness I don't understand - Directory names with trailing slashes no longer give SS.EXE the fits | ||
#30 | 3681 | Barrie Slaymaker | - VCP now scans much more of real_vss_1 and converts it to revml | ||
#29 | 3679 | Barrie Slaymaker | - VCP::Source::vss respects --case-sensitive in more places | ||
#28 | 3677 | Barrie Slaymaker | - rev_root sanity check is now case insensitive on Win32 - Parens in source filespecs are now treated as regular characters, not capture groups - ** is not treated as '...' | ||
#27 | 3667 | Barrie Slaymaker | - VCP-Source-vss.stml now has atomic questions instead of asking for a command-line-like vss: spec | ||
#26 | 3660 | Barrie Slaymaker | - VCP::Source::vss fixups | ||
#25 | 3654 | Barrie Slaymaker | - VCP-Source-vss UI prompt is more clear - VCP::Source::vss' --cd option removed until a need is found | ||
#24 | 3532 | John Fetkovich | changed File::Spec->rel2abs( blah, start_dir ) to start_dir_rel2abs blah everywhere. which does the same thing and is defined in VCP::Utils | ||
#23 | 3510 | Barrie Slaymaker | - VSS --continue and branching support | ||
#22 | 3496 | Barrie Slaymaker | - VSS branching | ||
#21 | 3489 | Barrie Slaymaker | - Document options emitted to .vcp files. | ||
#20 | 3477 | Barrie Slaymaker | - Make --rev-root only available in VCP::Source::p4 | ||
#19 | 3462 | Barrie Slaymaker | - Make sure bootstrap regexps get compiled | ||
#18 | 3460 | Barrie Slaymaker | - Revamp Plugin/Source/Dest hierarchy to allow for reguritating options in to .vcp files | ||
#17 | 3453 | Barrie Slaymaker | - VCP::Source::vss now reads deleted files, etc. - gentrevml generates more VSS-like RevML - add t/91vss2revml.t (not complete) | ||
#16 | 3433 | Barrie Slaymaker | - Merge in new VSS code. | ||
#15 | 3286 | John Fetkovich | In 'sub new' constructors of vss source and dest with a new sub, parse_vss_repo_spec in Utils/vss.pm. This also will set the repo_id. Only call parse_vss_repo_spec if the $spec is non-empty. | ||
#14 | 3275 | John Fetkovich | split part of 'sub new' into 'sub init' | ||
#13 | 3206 | John Fetkovich | documentation changes | ||
#12 | 3199 | John Fetkovich | Improved documentation of --bootstrap switch. | ||
#11 | 3155 | Barrie Slaymaker | Convert to logging using VCP::Logger to reduce stdout/err spew. Simplify & speed up debugging quite a bit. Provide more verbose information in logs. Print to STDERR progress reports to keep users from wondering what's going on. Breaks test; halfway through upgrading run3() to an inline function for speed and for VCP specific features. | ||
#10 | 3133 | Barrie Slaymaker | Make destinations call back to sources to check out files to simplify the architecture (is_metadata_only() no longer needed) and make it more optimizable (checkouts can be batched). | ||
#9 | 3120 | Barrie Slaymaker | Move changeset aggregation in to its own filter. | ||
#8 | 2837 | John Fetkovich | Use parse_options rather than using Getopt::Long directly. | ||
#7 | 2802 | John Fetkovich | Added a source_repo_id to each revision, and repo_id to each Source and Dest. The repo_ids include repository type (cvs,p4,revml,vss,...) and the repo_server fields. Changed the $self->...->set() and $self->...->get() lines in VCP::Dest::* to pass in a conglomerated key value, by passing in the key as an ARRAY ref. Also various restructuring in VCP::DB.pm, VCP::DB_file.pm and VCP::DB_file::sdbm.pm related to this change. | ||
#6 | 2743 | John Fetkovich | Add fields to vcp: source_name, source_filebranch_id, source_branch_id, source_rev_id, source_change_id 1. Alter revml.dtd to include the fields 2. Alter bin/gentrevml to emit legal RevML 3. Extend VCP::Rev to have the fields 4. Extend VCP::{Source,Dest}::revml to read/write the fields (VCP::Dest::revml should die() if VCP tries to emit illegal RevML) 5. Extend VCP::{Source,Dest}::{cvs,p4} to read the fields 7. Get all tests through t/91*.t to pass except those that rely on ch_4 labels | ||
#5 | 2389 | John Fetkovich | removed calls to methods: command_stderr_filter command_ok_result_codes command_chdir and replaced with named Plugin::run_safely method parameters stderr_filter ok_result_codes in_dir respectively, where possible. | ||
#4 | 2322 | Barrie Slaymaker | Fix jack-in-the-bug options parsing exposed by .vcp files | ||
#3 | 1855 | Barrie Slaymaker | Major VSS checkin. Works on Win32 | ||
#2 | 1822 | Barrie Slaymaker | Get all other tests passing but VSS. Add agvcommenttime sort field. | ||
#1 | 1810 | Barrie Slaymaker | Preliminary VSS checkin | ||