Source.pm #17

package VCP::Source ;

=head1 NAME

VCP::Source - A base class for repository sources

=head1 SYNOPSIS

=head1 DESCRIPTION

=head1 EXTERNAL METHODS

=cut

$VERSION = 0.1 ;

use strict ;

use UNIVERSAL qw( isa ) ;
use VCP::Debug qw( :debug ) ;
use VCP::Logger qw( lg BUG );

use base 'VCP::Driver' ;

use fields (
   'BOOTSTRAP',         ## The raw option so we can regurgitate it
   'BOOTSTRAP_REGEXPS', ## Determines what files are in bootstrap mode.
   'DEST',
   'CONTINUE',          ## Set if we're resuming from the prior
                        ## copy operation, if there is one.  This causes
                        ## us to determine a minimum rev by asking the
                        ## destination what it's seen on a given filebranch
) ;


sub init {
   my VCP::Source $self = shift;
   $self->bootstrap( $self->{BOOTSTRAP} );
}


###############################################################################

=head1 SUBCLASSING

This class uses the fields pragma, so you'll need to use base and 
possibly fields in any subclasses.  See L<VCP::Plugin> for methods
often needed in subclasses.

=head2 Subclass utility API

=over

=item options_spec

Adds common VCP::Source options to whatever options VCP::Plugin parses:

=over

=item *

"--bootstrap=pattern"

Forces all files matching the given shell regular expression (may use
wildcards like "*", "?", and "...") to have their first revisions
transferred as complete copies instead of deltas.  This is useful when
you want to transfer a revision other than the first revision as the
first revision in the target repository.  It is also useful when you
want to skip some revisions in the target repository (although the L<Map
filter|VCP::Filter::map> has superseded this use).

=item *

--continue

Tells VCP to continue where it left off from last time.  This will not
detect new branches of already transferred revisions (this limitation
should be lifted, but results in an expensive rescan of metadata), but
will detect updates to already transferred revisions.

=item * 

--rev-root=/path/to/top/dir

This forces the source driver to name all files sent relative to this
path instead of the path it normally would deduce from the filespec
portion of the source repository specification.  This is useful in cases
where you're moving a group of files and you want them to have extra
leading directory names on each file.

For example, if you're transferring the contents of /a/b/c/..., the
source would normally deduce a rev_root of "/a/b/c" and all of the
individual revisions would be named relative to that root, so
"/a/b/c/d/e/f/g" would be named "d/e/f/g".  Specifying a rev_root of
"/a" would result in a name of "b/c/d/e/f/g" instead.

This may also be useful when you're specifying an option that might
include files not included in the filespec (for an example of such an
option, see VCP::Source::p4's C<--follow-branch-into>).  Needing an
option like this for that purpose is a limitation; the revision root
deduction logic should be smart enough to deduce a root from the actual
files extracted and not from the filespec.
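
The naming rule in the /a/b/c example above can be sketched as follows.
This is only an illustration: C<name_relative_to> is a hypothetical
helper, not part of VCP, and it assumes plain "/"-separated paths.

```perl

   use strict;
   use warnings;

   ## Hypothetical helper: name $path relative to $rev_root.
   sub name_relative_to {
      my ( $path, $rev_root ) = @_;
      $rev_root =~ s{/+\z}{};      ## ignore trailing slashes on the root
      return undef                 ## $path is not underneath $rev_root
         unless index( $path, "$rev_root/" ) == 0;
      return substr( $path, length( $rev_root ) + 1 );
   }

   print name_relative_to( "/a/b/c/d/e/f/g", "/a/b/c" ), "\n";
   print name_relative_to( "/a/b/c/d/e/f/g", "/a"     ), "\n";

```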

=back

=cut

sub options_spec {
   my VCP::Source $self = shift;
   return (
      $self->SUPER::options_spec,
      "bootstrap|b=s"    => \$self->{BOOTSTRAP},
      "continue"         => \$self->{CONTINUE},
      ## "=s" so that --rev-root takes the path argument documented above
      "rev-root=s"       => sub { shift; $self->rev_root( @_ ) },
   );
}

=item dest

Sets/Gets a reference to the VCP::Dest object.  The source uses this to
call the handle_header(), handle_rev(), and handle_footer() methods.

=cut

sub dest {
   my VCP::Source $self = shift ;

   $self->{DEST} = shift if @_ ;
   return $self->{DEST} ;
}


=item continue

Sets/Gets the CONTINUE field (which the user sets via the --continue flag).

=cut

sub continue {
   my VCP::Source $self = shift ;

   $self->{CONTINUE} = shift if @_ ;
   return $self->{CONTINUE} ;
}



=back

=head1 SUBCLASS OVERLOADS

These methods should be overridden in any subclasses.

=over

=item copy_revs

REQUIRED OVERLOAD.

   $source->copy_revs() ;

Called by L<VCP/copy> to do the entire export process.  This is passed a
partially filled-in header structure.

The subclass should call this to move all the revisions over to the
destination:

   $self->SUPER::copy_revs( $revs );

If $revs, an ARRAY containing revisions, is not passed in,
$self->revs->remove_all() is used.
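
A minimal sketch of the hand-off described above, assuming a
destination that only needs handle_rev().  C<My::StubDest> and
C<send_revs> are hypothetical names used for illustration; they are
not part of VCP.

```perl

   use strict;
   use warnings;

   ## Hypothetical stand-in for a VCP::Dest subclass; it only records
   ## the revisions it is handed.
   package My::StubDest;
   sub new        { return bless { SEEN => [] }, shift }
   sub handle_rev { my $self = shift; push @{ $self->{SEEN} }, shift; return }

   package main;

   ## Hand each revision to the destination, then drop our reference to
   ## it so the array does not keep every revision alive until the end.
   sub send_revs {
      my ( $dest, $revs ) = @_;
      for my $i ( 0 .. $#$revs ) {
         $dest->handle_rev( $revs->[$i] );
         $revs->[$i] = undef;
      }
      return;
   }

   my $dest = My::StubDest->new;
   my $revs = [ "rev 1.1", "rev 1.2", "rev 1.3" ];
   send_revs( $dest, $revs );
   print scalar @{ $dest->{SEEN} }, " revisions sent\n";

```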

=cut

sub copy_revs {
   my VCP::Source $self = shift ;
   my ( $revs ) = @_;
   $revs ||= $self->revs->remove_all;
   VCP::Revs->set_file_fetcher( $self );
   for my $i ( 0..$#$revs ) {
      $self->dest->handle_rev( $revs->[$i] );
      $revs->[$i] = undef;
   }
}


=item fetch_files

Calls get_rev( $r ) for each parameter.

Overload this if you can batch requests more efficiently.
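
As a rough illustration of why batching can pay off, the sketch below
counts backend round trips.  C<My::Source> and C<My::BatchSource> are
hypothetical classes that only mimic the shape of this API, and the
returned file names are made up.

```perl

   use strict;
   use warnings;

   package My::Source;           ## hypothetical base, mimics the default
   sub new { return bless { CALLS => 0 }, shift }
   sub get_rev {
      my ( $self, $r ) = @_;
      ++$self->{CALLS};          ## one backend round trip per revision
      return "/tmp/work/$r";
   }
   sub fetch_files {
      my $self = shift;
      return map $self->get_rev( $_ ), @_;
   }

   package My::BatchSource;      ## hypothetical subclass that batches
   our @ISA = ( 'My::Source' );
   sub fetch_files {
      my $self = shift;
      ++$self->{CALLS};          ## a single round trip for all revisions
      return map "/tmp/work/$_", @_;
   }

   package main;

   for my $class ( 'My::Source', 'My::BatchSource' ) {
      my $s     = $class->new;
      my @files = $s->fetch_files( 'a.1', 'a.2', 'b.1' );
      print $class, ": ", scalar @files, " files, ", $s->{CALLS}, " call(s)\n";
   }

```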

=cut

sub fetch_files {
   my VCP::Source $self = shift ;
   map $self->get_rev( $_ ), @_;
}


=item handle_header

REQUIRED OVERLOAD.

Subclasses must add all repository-specific info to the $header, at least
including rep_type and rep_desc.

   $header->{rep_type} = 'p4' ;
   $self->p4( ['info'], \$header->{rep_desc} ) ;

The subclass must pass the $header on to the dest:

   $self->dest->handle_header( $header ) ;
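
Put together, an overriding handle_header() might look like the sketch
below.  C<My::HeaderDest> and C<My::Source::demo> are hypothetical
classes and the rep_desc text is invented; a real driver would query
its repository for that value.

```perl

   use strict;
   use warnings;

   package My::HeaderDest;       ## hypothetical destination
   sub new           { return bless {}, shift }
   sub handle_header { my $self = shift; $self->{HEADER} = shift; return }

   package My::Source::demo;     ## hypothetical source subclass
   sub new  { my ( $class, $dest ) = @_; return bless { DEST => $dest }, $class }
   sub dest { return $_[0]->{DEST} }
   sub handle_header {
      my ( $self, $header ) = @_;
      ## Add the repository-specific fields...
      $header->{rep_type} = 'demo';
      $header->{rep_desc} = 'in-memory demo repository';
      ## ...then pass the header on to the destination.
      $self->dest->handle_header( $header );
      return;
   }

   package main;

   my $dest   = My::HeaderDest->new;
   my $source = My::Source::demo->new( $dest );
   $source->handle_header( {} );
   print $dest->{HEADER}{rep_type}, ": ", $dest->{HEADER}{rep_desc}, "\n";

```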

=cut

sub handle_header {
   my VCP::Source $self = shift ;

#   my ( $header ) = @_ ;

   BUG "ERROR: handle_header not overloaded by class '", ref $self, "'.  Oops.\n";
#      if $self->can( 'handle_header' ) eq \&handle_header ;

#   $self->dest->handle_header( $header ) ;
}


=item handle_footer

Not a required overload, as the footer carries no useful information at
this time.  Overriding methods must call this method to pass the
$footer on:

   $self->SUPER::handle_footer( $footer ) ;

=cut

sub handle_footer {
   my VCP::Source $self = shift ;

   my ( $footer ) = @_ ;

   $self->dest->handle_footer( $footer ) ;
   VCP::Revs->set_file_fetcher( undef );
}


=item parse_time

   $time = $self->parse_time( $timestr ) ;

Parses "[cc]YY/MM/DD[ HH[:MM[:SS]]]".

Will add ability to use format strings in future.
HH, MM, and SS are assumed to be 0 if not present.

Returns a time suitable for feeding to localtime or gmtime.

Assumes local system time, so it is not suitable for parsing times in
RevML.  That is not a common thing to need to do, so the RevML time
parsing lives in VCP::Source::revml.
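
The core of the conversion can be sketched on its own.
C<parse_simple_time> is a hypothetical, cut-down variant written for
illustration: it assumes a four-digit year and does not handle the
two-digit-year or MM/DD/YYYY inputs that parse_time() accepts.

```perl

   use strict;
   use warnings;
   use Time::Local ();

   ## Hypothetical, simplified parser for "YYYY/MM/DD[ HH[:MM[:SS]]]".
   sub parse_simple_time {
      my ( $timestr ) = @_;
      die "Malformed time value '$timestr'\n"
         unless $timestr =~ /\A\d{4}\D\d?\d\D\d?\d(?:\D\d?\d){0,3}\z/;
      my @f = split /\D/, $timestr;      ## ( year, month, day, h, m, s )
      --$f[1];                           ## timelocal() wants months 0..11
      push @f, ( 0 ) x ( 6 - @f );       ## missing HH, MM, SS default to 0
      return Time::Local::timelocal( reverse @f );
   }

   my @lt = localtime( parse_simple_time( "2001/03/15 12:30" ) );
   printf "%04d-%02d-%02d %02d:%02d:%02d\n",
      $lt[5] + 1900, $lt[4] + 1, @lt[ 3, 2, 1, 0 ];

```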

=cut

{
    ## This routine is slow and gets called a *lot* with duplicate
    ## inputs, at least by VCP::Source::cvs, so we memoize it.
    my %cache;

    sub parse_time {
       my VCP::Source $self = shift ;
       my ( $timestr ) = @_ ;

       return $cache{$timestr} ||= do {
           ## TODO: Get parser context here & give file, line, and column.
           ## filename and rev too, while we're scheduling more work for
           ## the future.
           BUG "Malformed time value $timestr\n"
              unless $timestr =~ /^(\d\d)?\d?\d(\D\d?\d){2,5}/ ;
           my @f = split( /\D/, $timestr ) ;
           if (
              length $f[0] <= 2
              && $f[0] <= 12
              && ( length $f[2] == 4
                 || $f[2] > 12
                 || "0" eq substr( $f[2], 0, 1 )
              )
           ) {
              ## Must be MM/DD/YY, or MM/DD/YYYY.  timelocal() needs
              ## YY(YY)?/MM/DD
              splice @f, 0, 3, ( $f[2], $f[0], $f[1] );
           }

           --$f[1] ; # Month of year needs to be 0..11
           push @f, ( 0 ) x ( 6 - @f ) ;
           require Time::Local;
           my $t = eval { Time::Local::timelocal( reverse @f ) };
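           BUG $@ unless defined $t;
           ## No "return" here: it would leave parse_time() before the
           ## ||= above could store the result in %cache.
           $t;
        };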
    }
}


=item bootstrap

Sets (and parses) or gets the bootstrap spec.

Can be called plain:

   $self->bootstrap( $bootstrap_spec ) ;

See the command line documentation for the format of $bootstrap_spec.
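
The split-and-compile step can be sketched without the
Regexp::Shellish dependency.  C<compile_pattern> below is a
hypothetical stand-in for compile_shellish(), and its wildcard rules
("..." matches anything, "*" and "?" stop at "/") are assumptions made
for this sketch; Regexp::Shellish's actual rules may differ.

```perl

   use strict;
   use warnings;

   ## Assumed wildcard translations, for this sketch only.
   my %translate = ( "..." => ".*", "*" => "[^/]*", "?" => "[^/]" );

   ## Hypothetical stand-in for Regexp::Shellish::compile_shellish().
   sub compile_pattern {
      my ( $pattern ) = @_;
      ( my $re = $pattern ) =~ s{(\.\.\.|\*|\?|.)}{
         exists $translate{$1} ? $translate{$1} : quotemeta( $1 )
      }gse;
      return qr/\A$re\z/;
   }

   ## Split the spec on commas and compile each piece.
   my @regexps = map compile_pattern( $_ ),
      split /,+/, "//depot/foo/...,*.txt";

   for my $file ( "//depot/foo/bar/baz.c", "notes.txt", "//depot/other/x.c" ) {
      my $matches = grep $file =~ $_, @regexps;
      print $file, ": ", ( $matches ? "bootstrap" : "normal" ), "\n";
   }

```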

=cut

sub bootstrap {
   my VCP::Source $self = shift ;
   if ( @_ ) {
      my ( $val ) = @_ ;
      $self->{BOOTSTRAP} = $val;
      $self->{BOOTSTRAP_REGEXPS} = [
         defined $val
            ? do {
               require Regexp::Shellish;
               map Regexp::Shellish::compile_shellish( $_ ),
                  split /,+/, $val
            }
            : ()
      ];
   }

   return $self->{BOOTSTRAP};
}


=item is_bootstrap_mode

   ... if $self->is_bootstrap_mode( $file ) ;

Compares the filename passed in against the list of bootstrap regular
expressions set by L</bootstrap>.

The file should be in a format similar to the command line spec for
whatever repository is passed in, and not relative to rev_root, so
"//depot/foo/bar" for p4, or "module/foo/bar" for cvs.

This is typically called in the subclass only after looking at the
revision number to see if it is a first revision (in which case the
subclass should automatically put it in bootstrap mode).

=cut

sub is_bootstrap_mode {
   my VCP::Source $self = shift ;
   my ( $file ) = @_ ;

   my $result = grep $file =~ $_, @{$self->{BOOTSTRAP_REGEXPS}} ;

   lg(
      "$file ",
      ( $result ? "=~ " : "!~ " ),
      "[ ", join( ', ', map "qr/$_/", @{$self->{BOOTSTRAP_REGEXPS}} ), " ] (",
      ( $result ? "in " : "not in " ),
      "bootstrap mode)"
   ) if debugging;

   return $result ;
}

=back

=head1 COPYRIGHT

Copyright 2000, Perforce Software, Inc.  All Rights Reserved.

This module and the VCP package are licensed according to the terms given in
the file LICENSE accompanying this distribution, a copy of which is included in
L<vcp>.

=head1 AUTHOR

Barrie Slaymaker <barries@slaysys.com>

=cut

1
# Change User Description Committed
#46 5404 Barrie Slaymaker - SVN support added
- Makefile gives clearer notices about missing optional
  prereqs.
- VCP::Filter::labelmap and VCP::Filter::map: <<skip>> replaces
  deprecated <<delete>> to be clearer that no revisions
  are deleted from either repository but some just are
  skipped and not inserted.
- VCP::Filter::map: support added for SVN-like branch labels
- VCP::Source: support added for ISO8601 timestamps
  emitted by SVN.
#45 5082 Barrie Slaymaker - VCP::Source tells VCP::Rev to uncache the source to allow
  the source instance to be DESTROYed and thus clean up its
  working files.
#44 5078 Barrie Slaymaker - VCP::Source::parse_time() 0s out undefined/missing fields
#43 4500 Barrie Slaymaker - Minor POD cleanup
#42 4497 Barrie Slaymaker - --rev-root documented
       - All destinations handle rev_root defaulting now
#41 4487 Barrie Slaymaker - dead code removal (thanks to clkao's coverage report)
#40 4135 Barrie Slaymaker - Time fields may have trailing AM/PM or A/P without leading whitespace
#39 4134 Barrie Slaymaker - "AM", "PM", "A", and "P" (case insensitive) are now parsed
  properly when parsing time values
#38 4039 Barrie Slaymaker - VCP::Source::scan_metadata() API now in place,
- VCP::Source::copy_revs() is fully deprecated.
#37 4021 Barrie Slaymaker - Remove all phashes and all base & fields pragmas
- Work around SWASHGET error
#36 3982 Barrie Slaymaker - VCP::Source no longer leaks memory by delete()ing from a phash
- VCP::Source::cvs now flushes to disk more often to conserve RAM
#35 3970 Barrie Slaymaker - VCP::Source handles rev queing, uses disk to reduce RAM
- Lots of other fixes
#34 3922 Barrie Slaymaker - More paranoid paramter checking
#33 3916 Barrie Slaymaker - Reduce memory consumption
#32 3907 Barrie Slaymaker - Debugging cleanups
#31 3898 Barrie Slaymaker - VCP::Source::* --rev-root reinstanted
#30 3855 Barrie Slaymaker - vcp scan, filter, transfer basically functional
    - Need more work in re: storage format, etc, but functional
#29 3835 Barrie Slaymaker - VCP::Source supports queuing of revs and facilities for
  sending revs ASAP to conserve memory
#28 3820 Barrie Slaymaker - VCP::Source::revml now uses VCP::Source's queueing methods
    - For maintainability only, does not decrease memory util.
#27 3819 Barrie Slaymaker - Factor send & queueing of revs up in to VCP::Source
#26 3811 Barrie Slaymaker - fetch_*() and get_rev() renamed get_source_file()
#25 3806 Barrie Slaymaker - VCP::Source no longer tries to send to a missing dest
#24 3804 Barrie Slaymaker - Refactored to prepare way for reducing memory footprint
#23 3706 Barrie Slaymaker - VCP gives some indication of output progress (need more)
#22 3687 Barrie Slaymaker - Destinations may now use compile_path_re()
#21 3681 Barrie Slaymaker - VCP now scans much more of real_vss_1 and converts it to revml
#20 3679 Barrie Slaymaker - VCP::Source::vss respects --case-sensitive in more places
#19 3677 Barrie Slaymaker - rev_root sanity check is now case insensitive on Win32
- Parens in source filespecs are now treated as regular
  characters, not capture groups
- ** is not treated as '...'
#18 3477 Barrie Slaymaker - Make --rev-root only available in VCP::Source::p4
#17 3460 Barrie Slaymaker - Revamp Plugin/Source/Dest hierarchy to allow for
  reguritating options in to .vcp files
#16 3445 Barrie Slaymaker - Don't misparse YYYY/MM/DD dates as MMMM/DD/YY.
- t/61sort.t no longer blows up due to VCP::Rev's new
  BUG checks.
#15 3443 Barrie Slaymaker - Use BUG instead of Carp::confess
- Recognize MM/DD/YY format dates
#14 3157 Barrie Slaymaker debug conversion to VCP::Logger
#13 3155 Barrie Slaymaker Convert to logging using VCP::Logger to reduce stdout/err spew.
       Simplify & speed up debugging quite a bit.
       Provide more verbose information in logs.
       Print to STDERR progress reports to keep users from wondering
       what's going on.
       Breaks test; halfway through upgrading run3() to an inline
       function for speed and for VCP specific features.
#12 3133 Barrie Slaymaker Make destinations call back to sources to check out files to
       simplify the architecture (is_metadata_only() no longer needed)
       and make it more optimizable (checkouts can be batched).
#11 3131 Barrie Slaymaker Double the speed of the RCS file parser.
       Deprecate VCP::Revs::shift() in favor of remove_all().
#10 2824 John Fetkovich removed CVS_CONTINUE field from Source/cvs.pm, and added
       CONTINUE field and continue accessor to Source.pm.  Moved parsing
       of the --continue option also.
#9 2809 Barrie Slaymaker Implement --repo-id in Plugin.pm, refactor source & dest
       options parsing starting in VCP::Source::cvs (need to
       roll out to other sources and dests), get t/91cvs2revml.t
       passing again (first time in months! branching and
       --continue support works in cvs->foo!).
#8 2453 John Fetkovich removed compilation of revml.
 will be making that a separate executable.
#7 2293 Barrie Slaymaker Update CHANGES, TODO, improve .vcp files, add --init-cvs
#6 2015 Barrie Slaymaker submit changes
#5 1998 Barrie Slaymaker Initial, revml and core VCP support for branches
#4 1809 Barrie Slaymaker VCP::Patch should ignore lineends
#3 628 Barrie Slaymaker Cleaned up POD in bin/vcp, added BSD-style license.
#2 468 Barrie Slaymaker - VCP::Dest::p4 now does change number aggregation based on the
  comment field changing or whenever a new revision of a file with
  unsubmitted changes shows up on the input stream.  Since revisions of
  files are normally sorted in time order, this should work in a number
  of cases.  I'm sure we'll need to generalize it, perhaps with a time
  thresholding function.
- t/90cvs.t now tests cvs->p4 replication.
- VCP::Dest::p4 now doesn't try to `p4 submit` when no changes are
  pending.
- VCP::Rev now prevents the same label from being applied twice to
  a revision.  This was occuring because the "r_1"-style label that
  gets added to a target revision by VCP::Dest::p4 could duplicate
  a label "r_1" that happened to already be on a revision.
- Added t/00rev.t, the beginnings of a test suite for VCP::Rev.
- Tweaked bin/gentrevml to comment revisions with their change number
  instead of using a unique comment for every revision for non-p4
  t/test-*-in-0.revml files.  This was necessary to test cvs->p4
  functionality.
#1 467 Barrie Slaymaker Version 0.01, initial checkin in perforce public depot.