Perforce Checkpoint UTF-8 conversion tools - Release Notes
Version 1.00.00, General Release
Release Date: November 29, 2004

--------------------------------------------------------------------------------
PRODUCT INFORMATION
--------------------------------------------------------------------------------

Perforce servers may be updated with Internationalization Support. This support
is detailed on the Perforce website:

    http://www.perforce.com/perforce/technotes/note066.html

Before the update is performed, however, the Perforce database must not contain
any characters from the Extended ASCII or ANSI character sets, that is, any
byte with the high bit set (decimal values 128-255).

When attempting to update a Perforce database to support UTF-8, an error
message similar to the following may occur:

    bash-2.05b$ p4d -xi
    Table db.domain has 5 rows with invalid UTF8.
    Table db.have has 15 rows with invalid UTF8.
    Table db.label has 45 rows with invalid UTF8.
    Table db.rev has 10 rows with invalid UTF8.
    Table db.revcx has 10 rows with invalid UTF8.
    Table db.desc has 54 rows with invalid UTF8.
    Perforce server error:
            Database has 6 tables with non-UTF8 text and can't be switched to
            Unicode mode.

To resolve this error, all Extended ASCII characters must be removed from the
database.

The program utf8find, besides generating a listing of all extended ASCII
characters in the Perforce database, creates a duplicate of the database
checkpoint. In the duplicate, every character that is not part of the standard
ASCII character set is replaced with an XML tag that carries the decimal value
of the extended ASCII character found (for example, 150).

The script parse_failures.pl operates on the standard output of utf8find and
produces a human-readable damage report. The damage report is used as a
reference when modifying utf8conv.c.

utf8conv performs the final changes to the Perforce checkpoint.
It replaces the tags with the appropriate UTF-8 characters and writes the UTF-8
enabled checkpoint used for the final restore.

This tool does NOT affect any of the RCS files or their content. This tool
should be used ONLY to update the Perforce metadata (i.e. the db.xxx files or
their respective checkpoint). Modifications to RCS filenames should be made
manually after the tool has completed its task.

This program was written for and tested under Linux running a 2.4.21 kernel.
The perl script was tested under perl, v5.8.0 built for x86_64.

The tool was also built under Windows XP and tested against a server running
under Windows XP. If running under Windows, please execute "utf8find.exe" where
"utf8find" is referenced, "utf8conv.exe" where "utf8conv" is referenced, and
"perl parse_failures.pl" where "parse_failures.pl" is referenced.

--------------------------------------------------------------------------------
UNIT TEST
--------------------------------------------------------------------------------

The INSTALLATION INSTRUCTIONS were followed against a production Perforce
database. The server version used was:

    Server version: P4D/LINUX26AMD64/2004.2/70468 (2004/10/25)

--------------------------------------------------------------------------------
INSTALLATION INSTRUCTIONS
--------------------------------------------------------------------------------

Follow these steps to update a Perforce database containing Extended ASCII
characters to support UTF-8. It is usually wise to perform these steps against
a replica of the production database to limit user downtime and ensure proper
operation. For specific details on the Perforce commands mentioned here, please
refer to the appropriate Perforce documentation.

* Shut down the server
    p4 admin stop

* Checkpoint the server
    Change directory to the server root.
    p4d -r . -jc

* Execute utf8find to replace all extended ASCII characters with meta tags.
    ln -s checkpoint.353 checkpoint.dat
    utf8find > failures.txt

    Note: utf8find reads a file called checkpoint.dat and writes a file called
    utfcheckpoint.dat. Creating the symlink (link name checkpoint.dat, pointing
    at the checkpoint from the previous step, checkpoint.353 in this example)
    avoids copying the checkpoint. Under Win32, the checkpoint should be copied
    or renamed to checkpoint.dat. There is no symbolic link command as such
    under Windows.

* Restore utfcheckpoint.dat
    mkdir temp_database1
    cd temp_database1
    p4d -r . -jr ../utfcheckpoint.dat

* Upgrade the restored database
    p4d -r . -xi

* Checkpoint the upgraded database
    p4d -r . -jc

* Analyze the failures
    parse_failures.pl failures.txt > report.txt

    Look towards the bottom of the report for a list of characters that were
    remapped. You will need to update the table in utf8conv.c with the
    appropriate UTF-8 encoding for each of these characters.

    Look for changes to db.have, db.domain, and db.rev. Any changes made to
    these tables by the script must also be made to the RCS files stored on the
    server file system: you must locate these versioned files and manually
    rename each one. You can get the appropriate name for these files by
    examining the checkpoint utfcheckpoint.utf created in the next step.

* Execute utf8conv to replace all XML meta tags with UTF-8 character encoding.
    ln -s checkpoint.354 checkpoint.utf
    utf8conv

    Note: utf8conv reads a file called checkpoint.utf and writes a file called
    utfcheckpoint.utf. Creating the symlink (link name checkpoint.utf, pointing
    at the checkpoint of the upgraded database, checkpoint.354 in this example)
    avoids copying the checkpoint. Under Win32, the checkpoint should be copied
    or renamed to checkpoint.utf. There is no symbolic link command as such
    under Windows.

* Restore utfcheckpoint.utf
    mkdir temp_database2
    cd temp_database2
    p4d -r . -jr ../utfcheckpoint.utf

* Start up the server.
    p4d

* Verify that the metadata appears correct.
    Use a Perforce client to browse the Perforce metadata to ensure the
    conversion was a success.
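
When updating the table in utf8conv.c during the "Analyze the failures" step,
it can help to compute the UTF-8 byte sequence for each decimal value listed in
the damage report. The following is a minimal sketch in Python, not part of the
package. It assumes the original bytes were written in Windows-1252 (pass
"latin-1" if ISO-8859-1 is more likely for your data); these notes do not state
which code page a given database used. The function names utf8_bytes_for and
c_initializer are hypothetical, and the layout of the table in utf8conv.c is
not described here, so the sketch only prints the byte values to copy in.

```python
def utf8_bytes_for(code, source_encoding="cp1252"):
    """Return the UTF-8 bytes for one high-bit byte value.

    Returns None when the assumed source encoding leaves that byte
    undefined (Windows-1252 has a few such gaps, for example 0x81).
    """
    if not 128 <= code <= 255:
        raise ValueError("expected a decimal value in the range 128-255")
    try:
        # Interpret the single byte in the assumed legacy code page,
        # then re-encode the resulting character as UTF-8.
        return bytes([code]).decode(source_encoding).encode("utf-8")
    except UnicodeDecodeError:
        return None


def c_initializer(utf8):
    """Format bytes as comma-separated C hex literals, e.g. '0xC3, 0xA9'."""
    return ", ".join("0x%02X" % b for b in utf8)


if __name__ == "__main__":
    # Replace these with the decimal values listed in report.txt.
    for code in (150, 233):
        encoded = utf8_bytes_for(code)
        if encoded is None:
            print(code, "is undefined in the assumed source encoding")
        else:
            print(code, "->", c_initializer(encoded))
```

The choice of source encoding matters: the same decimal value can map to
different characters in different code pages, so check a few report entries
against text you recognize before committing the table.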

--------------------------------------------------------------------------------
BUILD INSTRUCTIONS
--------------------------------------------------------------------------------

This package comes with a makefile for building the sources. Extract the source
tarball and build from the utf8 directory:

    tar -xzvf utf8conv.tar.gz
    cd utf8
    make

A Makefile.win32 has been supplied for building under Windows. Please configure
your environment to build using Microsoft Program Maintenance Utility (nmake)
and Microsoft 32bit C/C++ Compiler (cl). After extracting the source tarball,
build from the utf8 directory:

    cd utf8
    nmake

================================================================================

Copyright
---------
Portions © 2004 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced Micro
Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with
respect to the accuracy or completeness of the contents of this publication and
reserves the right to make changes to specifications and product descriptions
at any time without notice. No license, whether express, implied, arising by
estoppel or otherwise, to any intellectual property rights is granted by this
publication. Except as set forth in AMD’s Standard Terms and Conditions of
Sale, AMD assumes no liability whatsoever, and disclaims any express or implied
warranty, relating to its products including, but not limited to, the implied
warranty of merchantability, fitness for a particular purpose, or infringement
of any intellectual property right.

AMD’s products are not designed, intended, authorized or warranted for use as
components in systems intended for surgical implant into the body, or in other
applications intended to support or sustain life, or in any other application
in which the failure of AMD’s product could create a situation where personal
injury, death, or severe property or environmental damage may occur.
AMD reserves the right to discontinue or make changes to its products at any
time without notice.

Trademarks
----------
AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced
Micro Devices, Inc. Other product names used in this publication are for
identification purposes only and may be trademarks of their respective
companies.

Open Source License
-------------------
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither Advanced Micro Devices, Inc. nor the names of its contributors may be
  used to endorse or promote products derived from this software without
  specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

================================================================================