Perforce Checkpoint UTF-8 conversion tools - Release Notes
Version 1.00.00, General Release
Release Date: November 29, 2004

--------------------------------------------------------------------------------
PRODUCT INFORMATION 
--------------------------------------------------------------------------------
Perforce servers may be updated with Internationalization Support.  This support
is detailed on the Perforce website:

http://www.perforce.com/perforce/technotes/note066.html

Before the update is performed, however, the Perforce database must not contain
any characters in the Extended ASCII or ANSI character set.  These characters
have decimal values of 128-255; that is, they are any character with the high
bit set.
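
As an illustration of the byte range described above, the following Python
sketch lists every byte in a buffer that has the high bit set.  It is not part
of this package; the function name find_high_bit_bytes and the sample data are
hypothetical.

```python
def find_high_bit_bytes(data: bytes) -> list:
    """Return (offset, value) pairs for every byte with the high bit set."""
    # A byte has the high bit set exactly when its value is 128-255.
    return [(offset, value) for offset, value in enumerate(data) if value & 0x80]


# Hypothetical sample: 0xE9 (decimal 233) is outside standard ASCII.
sample = b"caf\xe9 label"
print(find_high_bit_bytes(sample))
```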

When attempting to update a Perforce database to support UTF-8, an error message
similar to the following may occur:

bash-2.05b$ p4d -xi
Table db.domain has 5 rows with invalid UTF8.
Table db.have has 15 rows with invalid UTF8.
Table db.label has 45 rows with invalid UTF8.
Table db.rev has 10 rows with invalid UTF8.
Table db.revcx has 10 rows with invalid UTF8.
Table db.desc has 54 rows with invalid UTF8.
Perforce server error:
Database has 6 tables with non-UTF8 text and can't be switched to Unicode mode.
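
The per-table counts in a message like the one above can be collected with a
short script, for example to track progress between conversion attempts.  The
following Python sketch assumes the exact line format shown in the sample
output; other server versions may word the message differently.  The names
ROW_PATTERN and invalid_rows_by_table are hypothetical.

```python
import re

# Matches lines such as "Table db.have has 15 rows with invalid UTF8."
# The format is taken from the sample output above and is an assumption.
ROW_PATTERN = re.compile(r"^Table (db\.\w+) has (\d+) rows with invalid UTF8\.$")


def invalid_rows_by_table(output: str) -> dict:
    """Map each reported table name to its count of invalid rows."""
    counts = {}
    for line in output.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


sample = """Table db.domain has 5 rows with invalid UTF8.
Table db.have has 15 rows with invalid UTF8.
Perforce server error:
"""
counts = invalid_rows_by_table(sample)
print(counts, sum(counts.values()))
```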

To resolve this error, all Extended ASCII characters must be removed from the 
database.

The program utf8find, besides generating a listing of all extended ASCII 
characters in the Perforce database, will create a duplicate of the database 
checkpoint. All characters found that are not a part of the standard ASCII 
character set will be replaced with an XML tag: 

<UTF8_FIND_150> 

where 150 is the decimal equivalent of the extended ASCII character found. 
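
The tagging scheme can be sketched in a few lines of Python.  This is not the
utf8find source, only a minimal illustration of the substitution described
above; the function name tag_extended_bytes is hypothetical, and the tag is
assumed to carry the plain decimal value, as in the example.

```python
def tag_extended_bytes(data: bytes) -> bytes:
    """Replace each byte in the range 128-255 with a <UTF8_FIND_nnn> tag."""
    out = bytearray()
    for value in data:
        if value < 128:
            out.append(value)                   # standard ASCII passes through
        else:
            out += b"<UTF8_FIND_%d>" % value    # nnn is the decimal byte value
    return bytes(out)


# Byte 0x96 has the decimal value 150.
print(tag_extended_bytes(b"build\x96label"))
```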

The script parse_failures.pl operates on the STDOUT output of utf8find.  It
outputs a human-readable damage report, which serves as a reference when
modifying utf8conv.c.

utf8conv performs the final changes to the Perforce checkpoint.  It replaces 
the <UTF8_FIND_xxx> tags with the appropriate UTF-8 character and updates the 
UTF-8 enabled checkpoint for a final restore.
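
The reverse substitution can be sketched the same way.  The Python below is not
the utf8conv source; it only illustrates a table-driven replacement of tags
with UTF-8 byte sequences.  CHAR_TABLE, TAG_PATTERN, and replace_tags are
hypothetical names, and the two table entries assume the original text was
encoded as Windows-1252.  The correct entries for a given site depend on the
encoding its clients actually used.

```python
import re

# Hypothetical table: original byte value -> intended character.
# The entries below assume Windows-1252 source text.
CHAR_TABLE = {
    150: "\u2013",  # en dash
    233: "\u00e9",  # e with acute accent
}

TAG_PATTERN = re.compile(rb"<UTF8_FIND_(\d+)>")


def replace_tags(data: bytes) -> bytes:
    """Replace each known tag with the UTF-8 encoding of its character."""
    def substitute(match):
        value = int(match.group(1))
        if value not in CHAR_TABLE:
            return match.group(0)   # leave unmapped tags in place for review
        return CHAR_TABLE[value].encode("utf-8")
    return TAG_PATTERN.sub(substitute, data)


print(replace_tags(b"build<UTF8_FIND_150>label"))
```

Leaving unmapped tags untouched, rather than guessing, keeps them easy to find
in the output so the table can be extended and the conversion rerun.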

This tool does NOT affect any of the RCS files or their content.  This tool
should be used ONLY to update the Perforce metadata (i.e. the db.xxx files or
their respective checkpoint).  Modifications to RCS filenames should be made
manually after the tool has completed its task.

This program was written for and tested under Linux running a 2.4.21 kernel.
The perl script was tested under perl, v5.8.0 built for x86_64.  The tool was
also built under Windows XP and tested against a server running under Windows
XP. If running under Windows, please execute "utf8find.exe" where "utf8find" is
referenced, "utf8conv.exe" where "utf8conv" is referenced, and "perl
parse_failures.pl" where "parse_failures.pl" is referenced.

--------------------------------------------------------------------------------
UNIT TEST 
--------------------------------------------------------------------------------
The INSTALLATION INSTRUCTIONS were followed against a production Perforce 
database.  

The server version used was:

Server version: P4D/LINUX26AMD64/2004.2/70468 (2004/10/25)

--------------------------------------------------------------------------------
INSTALLATION INSTRUCTIONS 
--------------------------------------------------------------------------------
Follow these steps to update a Perforce database containing Extended ASCII 
characters to support UTF-8.  It is usually wise to perform these steps against
a replica of the production database to limit user downtime and ensure proper
operation.

For specific details on the Perforce commands mentioned here, please refer to
the appropriate Perforce documentation.

* Shut down the server
	p4 admin stop
* Checkpoint server 
	change directory to the server root.
	p4d -r . -jc
* Execute utf8find to replace all extended ASCII characters with meta tags.
	ln -s checkpoint.353 checkpoint.dat
	utf8find > failures.txt

		Note: 	utf8find reads a file called checkpoint.dat and writes a
			file called utfcheckpoint.dat.  Create the symlink so
			that checkpoint.dat points to the checkpoint you created
			in the previous step (checkpoint.353 in this example)
			without copying it.

			Under Win32, the checkpoint should be copied or renamed to
			checkpoint.dat. There is no symbolic link command as such
			under Windows.
* Restore utfcheckpoint.dat
	mkdir temp_database1
	cd temp_database1
	p4d -r . -jr ../utfcheckpoint.dat
* Upgrade restored database
	p4d -r . -xi 
* Checkpoint upgraded database 
	p4d -r . -jc 
* Analyze the failures
	parse_failures.pl failures.txt > report.txt

	Look towards the bottom of report.txt for a list of characters that got
	remapped. You will need to update the table in utf8conv.c with the
	appropriate UTF-8 encoding for each of these characters.

	Look for changes to db.have, db.domain, and db.rev and note that any
	changes made to these tables by the script must also be made to the RCS
	files stored on the server file system.  You must locate these versioned
	files and manually rename each one.  You can get the appropriate name for
	these files by examining the checkpoint utfcheckpoint.utf created in the
	next step.
* Execute utf8conv to replace all XML meta tags with UTF-8 character encoding.
	ln -s checkpoint.354 checkpoint.utf
	utf8conv

		Note: 	utf8conv reads a file called checkpoint.utf and writes a
			file called utfcheckpoint.utf.  Create the symlink so
			that checkpoint.utf points to the checkpoint you created
			previously (checkpoint.354 in this example) without
			copying it.

			Under Win32, the checkpoint should be copied or renamed to
			checkpoint.utf. There is no symbolic link command as such
			under Windows.
* Restore utfcheckpoint.utf 
	mkdir temp_database2
	cd temp_database2
	p4d -r . -jr ../utfcheckpoint.utf
* Start up the server.
	p4d
* Verify that the metadata appears correct. 
	Use a Perforce client to browse the Perforce metadata to ensure the 
	conversion was a success.
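
Before restoring the converted checkpoint, it can be useful to confirm that it
contains no leftover tags and decodes cleanly as UTF-8.  The following Python
sketch shows one way to do this.  It is not part of this package, the function
name check_converted is hypothetical, and it checks an in-memory buffer, so a
large checkpoint would need to be read in pieces instead.

```python
def check_converted(data: bytes) -> list:
    """Return a list of problems found in converted checkpoint data."""
    problems = []
    # Any remaining tag means the table in utf8conv.c is missing an entry.
    if b"<UTF8_FIND_" in data:
        problems.append("unconverted <UTF8_FIND_xxx> tag found")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        problems.append("invalid UTF-8 at byte offset %d" % err.start)
    return problems


# Hypothetical samples; real input would be the bytes of utfcheckpoint.utf,
# for example open("utfcheckpoint.utf", "rb").read().
print(check_converted(b"label \xe2\x80\x93 ok"))
print(check_converted(b"label \x96 bad"))
```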


--------------------------------------------------------------------------------
BUILD INSTRUCTIONS 
--------------------------------------------------------------------------------
This package comes with a makefile for building the sources.

Extract the source tarball and build from the utf8 directory:

	tar -xzvf utf8conv.tar.gz
	cd utf8
	make

A Makefile.win32 has been supplied for building under Windows.  Please configure
your environment to build using Microsoft Program Maintenance Utility (nmake)
and Microsoft 32bit C/C++ Compiler (cl).

After extracting the source tarball, build from the utf8 directory:

	cd utf8
	nmake /f Makefile.win32

================================================================================
Copyright
---------
Portions © 2004 Advanced Micro Devices, Inc. All rights reserved.
The contents of this document are provided in connection with Advanced Micro 
Devices, Inc. (AMD) products. AMD makes no representations or warranties 
with respect to the accuracy or completeness of the contents of this 
publication and reserves the right to make changes to specifications and product 
descriptions at any time without notice. No license, whether express, implied, 
arising by estoppel or otherwise, to any intellectual property rights is granted 
by this publication. Except as set forth in AMD's Standard Terms and Conditions
of Sale, AMD assumes no liability whatsoever, and disclaims any express or 
implied warranty, relating to its products including, but not limited to, the 
implied warranty of merchantability, fitness for a particular purpose, or 
infringement of any intellectual property right. AMD's products are not designed,
intended, authorized or warranted for use as components in systems intended for 
surgical implant into the body, or in other applications intended to support or 
sustain life, or in any other application in which the failure of AMD's product
could create a situation where personal injury, death, or severe property or 
environmental damage may occur. AMD reserves the right to discontinue or make 
changes to its products at any time without notice.


Trademarks
----------
AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced 
Micro Devices, Inc.
 
Other product names used in this publication are for identification purposes only 
and may be trademarks of their respective companies.


Open Source License
-------------------
Redistribution and use in source and binary forms, with or without modification, 
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this 
list of conditions and the following disclaimer. 
* Redistributions in binary form must reproduce the above copyright notice, this 
list of conditions and the following disclaimer in the documentation and/or other 
materials provided with the distribution. 
* Neither Advanced Micro Devices, Inc. nor the names of its contributors may be 
used to endorse or promote products derived from this software without specific 
prior written permission. 
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND 
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, 
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR 
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT 
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, 
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF 
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
================================================================================
Change History
--------------
Revision #1, change 4673, committed by Raymond Danks: UTF8Conv_1.00.00

This tool originates in the AMD PCS LDC Perforce Server //admin/utf8 as of
changelist 53132.

Please review the README and unit_test.txt documentation for usage and purpose.