Prentice hall optimizing c plus plus jul 1998 ISBN 0139774300


Document information

ISBN: 0-13-977430-0
Copyright 1999 by Prentice-Hall PTR
Copyright 2000 by Chrysalis Software Corporation

Would you like to discuss interesting topics with intelligent people? If so, you might be interested in Colloquy, the world's first internet-based high-IQ society. I'm one of the Regents of this society, and am responsible for the email list and for membership applications. Application instructions are available here.

Imagine that you are about to finish a relatively large program, one that has taken a few weeks or months to write and debug. Just as you are putting the finishing touches on it, you discover that it is either too slow or runs out of memory when you feed it a realistic set of input data. You sigh, and start the task of optimizing it.

But why optimize? If your program doesn't fit in memory, you can just get more memory; if it is too slow, you can get a faster processor.

I have written Optimizing C++ because I believe that this common attitude is incorrect, and that a knowledge of optimization is essential to a professional programmer. One very important reason is that we often have little control over the hardware on which our programs are to be run. In this situation, the simplistic approach of adding more hardware is not feasible.

Optimizing C++ provides working programmers and those who intend to be working programmers with a practical, real-world approach to program optimization. Many of the optimization techniques presented are derived from my reading of academic journals that are, sadly, little known in the programming community. This book also draws on my nearly 30 years of experience as a programmer in diverse fields of application, during which I have become increasingly concerned about the amount of effort spent in reinventing optimization techniques rather than applying those already developed.

The first question you have to answer is whether your program needs optimization at all. If it does, you have to determine what part of the program is the culprit, and what resource is being overused. Chapter 1 indicates a method of attack on these problems, as well as a real-life example.

All of the examples in this book were compiled with both Microsoft's Visual C++ 5.0 and the DJGPP compiler, written and copyrighted by DJ Delorie. The latter compiler is available here. The source code for the examples is available here. If you want to use DJGPP, I recommend that you also get RHIDE, an integrated development environment for the DJGPP compiler, written and copyrighted by Robert Hoehne, which is available here.

All of the timings and profiling statistics, unless otherwise noted, were the result of running the corresponding program compiled with Visual C++ 5.0 on my Pentium II 233 Megahertz machine with 64 megabytes of memory.

I am always happy to receive correspondence from readers. If you wish to contact me, the best way is to visit my WWW home page. If you prefer, you can email me.

In the event that you enjoy this book and would like to tell others about it, you might want to write an on-line review on Amazon.com, which you can do here.

I should also tell you how the various typefaces are used in the book. HelveticaNarrow is used for program listings, for terms used in programs, and for words defined by the C++ language. Italics are used primarily for technical terms that are found in the glossary, although they are also used for emphasis in some places. The first time that I use a particular technical term that you might not know, it is in bold face.

Now, on with the show!

  • Dedication
  • Acknowledgements
  • Prologue
  • A Supermarket Price Lookup System
  • A Mailing List System
  • Cn U Rd Ths (Qkly)? A Data Compression Utility
  • Free at Last: An Efficient Method of Handling Variable-Length Records
  • Heavenly Hash: A Dynamic Hashing Algorithm
  • Zensort: A Sorting Algorithm for Limited Memory
  • Mozart, No. Would You Believe Gershwin?
  • About the Author

This book is dedicated to Susan Patricia Caffee Heller, the light of my life. Without her, this book would not be what it is; even more important, I would not be what I am: a happy man.

Acknowledgements

I'd like to thank all those readers who have provided feedback on the first two editions of this book, especially those who have posted reviews on Amazon.com; their contributions have made this a better book.

I'd also like to thank Jeff Pepper, my editor at Prentice-Hall, for his support and encouragement. Without him, this third edition would never have been published.

Finally, I would like to express my appreciation to John P. Linderman at AT&T Labs Research for his help with the code in the chapter on sorting immense files.

Prologue

Introduction to Optimization

What is optimization anyway? Clearly, we have to know this before we can discuss how and why we should optimize programs.

Definition: Optimization is the art and science of modifying a working computer program so that it makes more efficient use of one or more scarce resources, primarily memory, disk space, or time.

This definition has a sometimes overlooked but very important corollary (The First Law of Optimization): The speed of a nonworking program is irrelevant.

Algorithms Discussed: Radix40 Data Representation, Lookup Tables

...

            Displacement[i] = 0;
            TotalDisplacement[i] = 0;
        }

        // Counting pass: total the bytes that belong to each key segment.
        for (i = 0; ; i ++) {
            InputFile.getline(InputLine,INPUTLINESIZE);
            if (!InputFile)
                break;
            TotalKeys ++;
            LineLength = strlen(InputLine) + 1;
            strcpy(InputLine+LineLength-1,"\n");
            KeySegment = CalculateKeySegment(Pass,InputLine,LineLength);
            Displacement[KeySegment] += LineLength;
        }
        InputFile.close();

        // Turn the per-segment byte counts into starting offsets in the output file.
        for (i = 1; i < BUFCOUNT; i ++)
            TotalDisplacement[i] = TotalDisplacement[i-1] + Displacement[i-1];

        if (TotalData == 0) {
            for (i = 0; i < BUFCOUNT; i ++)
                TotalData += Displacement[i];
            BufferRatio = (double) TOTAL_BUFFER / TotalData;
        }

        // Give each segment a share of the big buffer proportional to its data.
        TotalBufferSize = 0;
        for (i = 0; i < BUFCOUNT; i ++) {
            BufferSize[i] = (int) (BufferRatio*Displacement[i]);
            Buffer[i] = BigBuffer + TotalBufferSize;
            TotalBufferSize += BufferSize[i];
            BufferCharCount[i] = 0;
        }
        memset(BigBuffer,0,TOTAL_BUFFER);

        if ((Pass == PassCount - 1) && StatisticsDisplayed == false) {
            printf("Total buffer space: %d\n",TOTAL_BUFFER);
            printf("Total keys: %d\n", TotalKeys);
            printf("Total data: %d\n", TotalData);
            StatisticsDisplayed = true;
        }

        sprintf(temp,"Finished counting on pass %d",PassCount-Pass);
        timing(temp);

        // Distribution pass: copy each record into the buffer for its segment.
        InputFile.open(InputFileName,ios::in|ios::binary);
        for (i = 0; ; i ++) {
            InputFile.getline(InputLine,INPUTLINESIZE);
            if (!InputFile)
                break;
            LineLength = strlen(InputLine)+1;
            strcpy(InputLine+LineLength-1,"\n");
            KeySegment = CalculateKeySegment(Pass,InputLine,LineLength);
            CurrentLength = BufferCharCount[KeySegment];

            // The record is larger than the whole buffer for its segment:
            // flush the buffer and write the record directly.
            if (LineLength > BufferSize[KeySegment]) {
                OutputFile.seekp(TotalDisplacement[KeySegment]);
                if (CurrentLength > 0)
                    OutputFile.write(Buffer[KeySegment],CurrentLength);
                BufferCharCount[KeySegment] = 0;
                OutputFile.write(InputLine,LineLength);
                TotalDisplacement[KeySegment] += CurrentLength + LineLength;
                TotalWrites ++;
                continue;
            }

            NewLength = CurrentLength + LineLength;
            if (NewLength >= BufferSize[KeySegment]) {
                // The buffer would fill: top it up, write it out,
                // and keep the rest of the record for the next write.
                PartialLength = BufferSize[KeySegment] - CurrentLength;
                memcpy(Buffer[KeySegment]+CurrentLength,
                    InputLine,PartialLength);
                CurrentLength = BufferSize[KeySegment];
                OutputFile.seekp(TotalDisplacement[KeySegment]);
                OutputFile.write(Buffer[KeySegment],CurrentLength);
                TotalDisplacement[KeySegment] += CurrentLength;
                TotalWrites ++;
                memset(Buffer[KeySegment],0,CurrentLength);
                memcpy(Buffer[KeySegment],InputLine+PartialLength,
                    LineLength-PartialLength);
                BufferCharCount[KeySegment] = LineLength - PartialLength;
            } else {
                memcpy(Buffer[KeySegment]+BufferCharCount[KeySegment],
                    InputLine,LineLength);
                BufferCharCount[KeySegment] += LineLength;
            }
        }

        // Write out whatever is left in each buffer.
        for (i = 0; i < BUFCOUNT; i ++) {
            if (Buffer[i]) {
                CurrentLength = BufferCharCount[i];
                if (CurrentLength > 0) {
                    OutputFile.seekp(TotalDisplacement[i]);
                    OutputFile.write(Buffer[i],CurrentLength);
                    TotalWrites ++;
                }
            }
        }
        InputFile.close();
        OutputFile.close();

        sprintf(temp,"Finished distributing on pass %d",PassCount-Pass);
        timing(temp);
    }

    printf("Total writes: %d\n", TotalWrites);
    end_timing();
    return 0;
}

// NOTE: the names of the five system headers were lost in extraction.
// The five below are an assumption, inferred from the calls the program makes.
#include <iostream.h>
#include <fstream.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "e:\opt\common\timings.h"

// Build the segment number from the first six decimal digits of the key.
int CalculateKeySegment(char* InputLine) {
    int KeySegment = 0;
    for (int i = 0; i < 6; i ++) {
        KeySegment *= 10;
        KeySegment += InputLine[i]-'0';
    }
    return KeySegment;
}

int main(int argc, char *argv[]) {
    const int KEY_PREFIX_LENGTH = 2;
    const int MAXPASSCOUNT = 100;
    const int BUFCOUNT = 1000000;
    const int TOTAL_BUFFER = 16*1048576;
    const int INPUTLINESIZE = 1024;
    char InputLine[INPUTLINESIZE];
    int* BufferOffset = new int [BUFCOUNT+1];
    int* BufferCharCount = new int[BUFCOUNT];
    int KeySegment;
    char* InputFileName;
    char* OutputFileName;
    ifstream InputFile;
    ofstream OutputFile;
    int PassCount;
    int CurrentLength;
    int NewLength;
    int LineLength;
    int TotalKeys = 0;
    bool StatisticsDisplayed = false;
    int TotalWrites = 0;
    int i;
    int j;
    double BufferRatio;
    int PartialLength;
    int TotalBufferSize;
    int KeyLength;

    if (argc < 4) {
        printf("Usage: zen05 keylength infile outfile\n");
        exit(1);
    } else {
        KeyLength = atoi(argv[1]);
        InputFileName = argv[2];
        OutputFileName = argv[3];
    }

    char temp[100];
    start_timing();
    InputFile.open(InputFileName,ios::in|ios::binary);

    //start counting pass
    int* BufferCapacity = new int[BUFCOUNT];
    for (i = 0; i < BUFCOUNT; i ++)
        BufferCapacity[i] = 0;

    for (i = 0; ; i ++) {
        InputFile.getline(InputLine,INPUTLINESIZE);
        if (!InputFile)
            break;
        TotalKeys ++;
        LineLength = strlen(InputLine);
        if (LineLength < KeyLength) {
            printf("Illegal record: %s",InputLine);
            exit(1);
        }
        KeySegment = CalculateKeySegment(InputLine);
        BufferCapacity[KeySegment] += LineLength + 1;
    }

    int Split[MAXPASSCOUNT]; // possible number of passes
    int SplitTotalSize[MAXPASSCOUNT]; // bytes per pass
    int SplitData;
    int ThisDisplacement;
    int TotalData = 0;
    Split[0] = 0;
    i = 0;
    // Assign consecutive segments to each pass until the big buffer is full.
    for (j = 1; j < MAXPASSCOUNT; j ++) {
        SplitData = 0;
        BufferOffset[i] = 0;
        for (; i < BUFCOUNT; i ++) {
            ThisDisplacement = BufferCapacity[i];
            if (SplitData + ThisDisplacement > TOTAL_BUFFER)
                break;
            SplitData += ThisDisplacement;
            BufferOffset[i+1] = SplitData;
        }
        Split[j] = i;
        SplitTotalSize[j-1] = SplitData;
        TotalData += SplitData;
        if (i == BUFCOUNT)
            break;
    }
    delete [] BufferCapacity;
    PassCount = j;

    printf("Total buffer space: %d\n",TOTAL_BUFFER);
    printf("Total keys: %d\n", TotalKeys);
    printf("Total data: %d\n", TotalData);
    sprintf(temp,"Finished counting");
    timing(temp);

    OutputFile.open(OutputFileName,ios::out|ios::binary);
    char* BigBuffer = new char [TOTAL_BUFFER];

    for (int Pass = 0; Pass < PassCount ; Pass ++) {
        for (i = Split[Pass]; i < Split[Pass+1]; i ++) {
            BufferCharCount[i] = 0;
        }
        InputFile.clear();
        InputFile.seekg(0);
        memset(BigBuffer,0,TOTAL_BUFFER);
        int CompareResult;

        for (i = 0; ; i ++) {
            InputFile.getline(InputLine,INPUTLINESIZE);
            if (!InputFile)
                break;
            LineLength = strlen(InputLine)+1;
            strcpy(InputLine+LineLength-1,"\n");
            KeySegment = CalculateKeySegment(InputLine);
            char* Where;
            if (KeySegment >= Split[Pass] && KeySegment < Split[Pass+1]) {
                CurrentLength = BufferCharCount[KeySegment];
                char* CurrentPosition = BufferOffset[KeySegment]+BigBuffer;
                char* EndOfBuffer = CurrentPosition + CurrentLength;
                // Insertion sort within the segment: stop at the first
                // record whose key is greater than the new one.
                for (Where = CurrentPosition; Where < EndOfBuffer;) {
                    CompareResult = memcmp(InputLine,Where,KeyLength);
                    if (CompareResult < 0) {
                        break;
                    } else {
                        while (*(Where++) != '\n')
                            ;
                    }
                }
                memmove(Where+LineLength,Where,EndOfBuffer-Where);
                memcpy(Where,InputLine,LineLength);
                BufferCharCount[KeySegment] += LineLength;
            }
        }
        OutputFile.write(BigBuffer,SplitTotalSize[Pass]);
        TotalWrites ++;
        sprintf(temp,"Finished distributing on pass %d",PassCount-Pass);
        timing(temp);
    }

    InputFile.close();
    OutputFile.close();
    printf("Total writes: %d\n", TotalWrites);
    end_timing();
    return 0;
}

// NOTE: the names of the five system headers were lost in extraction.
// The five below are an assumption, inferred from the calls the program makes.
#include <iostream.h>
#include <fstream.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "e:\opt\common\timings.h"

// Build the segment number from the first three characters of the key,
// treating each as a base-96 digit counted from the space character.
int CalculateKeySegment(char* InputLine) {
    int KeySegment = 0;
    unsigned char LowChar = (InputLine[2] - ' ');
    unsigned char MiddleChar = InputLine[1] - ' ';
    unsigned char HighChar = InputLine[0] - ' ';
    KeySegment = HighChar * 96 * 96 + MiddleChar * 96 + LowChar;
    return KeySegment;
}

int main(int argc, char *argv[]) {
    const int KEY_PREFIX_LENGTH = 2;
    const int MAXPASSCOUNT = 100;
    const int BUFCOUNT = 96*96*96;
    const int TOTAL_BUFFER = 16*1048576;
    const int INPUTLINESIZE = 1024;
    char InputLine[INPUTLINESIZE];
    int* BufferOffset = new int [BUFCOUNT+1];
    int* BufferCharCount = new int[BUFCOUNT];
    int KeySegment;
    char* InputFileName;
    char* OutputFileName;
    ifstream InputFile;
    ofstream OutputFile;
    int PassCount;
    int CurrentLength;
    int NewLength;
    int LineLength;
    int TotalKeys = 0;
    bool StatisticsDisplayed = false;
    int TotalWrites = 0;
    int i;
    int j;
    double BufferRatio;
    int PartialLength;
    int TotalBufferSize;
    int KeyLength;

    if (argc < 4) {
        printf("Usage: zen06 keylength infile outfile\n");
        exit(1);
    } else {
        KeyLength = atoi(argv[1]);
        InputFileName = argv[2];
        OutputFileName = argv[3];
    }

    char temp[100];
    start_timing();
    InputFile.open(InputFileName,ios::in|ios::binary);

    //start counting pass
    int* BufferCapacity = new int[BUFCOUNT];
    for (i = 0; i < BUFCOUNT; i ++) {
        BufferCapacity[i] = 0;
    }

    for (i = 0; ; i ++) {
        InputFile.getline(InputLine,INPUTLINESIZE);
        if (!InputFile)
            break;
        TotalKeys ++;
        LineLength = strlen(InputLine);
        if (LineLength < KeyLength) {
            printf("Illegal record: %s",InputLine);
            exit(1);
        }
        KeySegment = CalculateKeySegment(InputLine);
        BufferCapacity[KeySegment] += LineLength + 1;
    }

    int Split[MAXPASSCOUNT]; // possible number of passes
    int SplitTotalSize[MAXPASSCOUNT]; // bytes per pass
    int SplitData;
    int ThisDisplacement;
    int TotalData = 0;
    Split[0] = 0;
    i = 0;
    for (j = 1; j < MAXPASSCOUNT; j ++) {
        SplitData = 0;
        BufferOffset[i] = 0;
        for (; i < BUFCOUNT; i ++) {
            ThisDisplacement = BufferCapacity[i];
            if (SplitData + ThisDisplacement > TOTAL_BUFFER)
                break;
            SplitData += ThisDisplacement;
            BufferOffset[i+1] = SplitData;
        }
        Split[j] = i;
        SplitTotalSize[j-1] = SplitData;
        TotalData += SplitData;
        if (i == BUFCOUNT)
            break;
    }
    delete [] BufferCapacity;
    PassCount = j;

    printf("Total buffer space: %d\n",TOTAL_BUFFER);
    printf("Total keys: %d\n", TotalKeys);
    printf("Total data: %d\n", TotalData);
    sprintf(temp,"Finished counting");
    timing(temp);

    OutputFile.open(OutputFileName,ios::out|ios::binary);
    char* BigBuffer = new char [TOTAL_BUFFER];

    for (int Pass = 0; Pass < PassCount ; Pass ++) {
        for (i = Split[Pass]; i < Split[Pass+1]; i ++) {
            BufferCharCount[i] = 0;
        }
        InputFile.clear();
        InputFile.seekg(0);
        memset(BigBuffer,0,TOTAL_BUFFER);
        int CompareResult;

        for (i = 0; ; i ++) {
            InputFile.getline(InputLine,INPUTLINESIZE);
            if (!InputFile)
                break;
            LineLength = strlen(InputLine)+1;
            strcpy(InputLine+LineLength-1,"\n");
            KeySegment = CalculateKeySegment(InputLine);
            char* Where;
            if (KeySegment >= Split[Pass] && KeySegment < Split[Pass+1]) {
                CurrentLength = BufferCharCount[KeySegment];
                char* CurrentPosition = BufferOffset[KeySegment]+BigBuffer;
                char* EndOfBuffer = CurrentPosition + CurrentLength;
                for (Where = CurrentPosition; Where < EndOfBuffer;) {
                    CompareResult = memcmp(InputLine,Where,KeyLength);
                    if (CompareResult < 0) {
                        break;
                    } else {
                        while (*(Where++) != '\n')
                            ;
                    }
                }
                memmove(Where+LineLength,Where,EndOfBuffer-Where);
                memcpy(Where,InputLine,LineLength);
                BufferCharCount[KeySegment] += LineLength;
            }
        }
        OutputFile.write(BigBuffer,SplitTotalSize[Pass]);
        TotalWrites ++;
        sprintf(temp,"Finished distributing on pass %d",PassCount-Pass);
        timing(temp);
    }

    InputFile.close();
    OutputFile.close();
    printf("Total writes: %d\n", TotalWrites);
    end_timing();
    return 0;
}

// NOTE: the names of the five system headers were lost in extraction.
// The five below are an assumption, inferred from the calls the program makes.
#include <iostream.h>
#include <fstream.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "e:\opt\common\timings.h"

int CalculateKeySegment(char* InputLine) {
    int KeySegment = 0;
    unsigned char LowChar = (InputLine[2] - ' ');
    unsigned char MiddleChar = InputLine[1] - ' ';
    unsigned char HighChar = InputLine[0] - ' ';
    KeySegment = HighChar * 96 * 96 + MiddleChar * 96 + LowChar;
    return KeySegment;
}

int main(int argc, char *argv[]) {
    const int KEY_PREFIX_LENGTH = 2;
    const int MAXPASSCOUNT = 100;
    const int BUFCOUNT = 96*96*96;
    const int TOTAL_BUFFER = 16*1048576;
    const int INPUTLINESIZE = 100;  // fixed-length records in this version
    char InputLine[INPUTLINESIZE];
    int* BufferOffset = new int [BUFCOUNT+1];
    int* BufferCharCount = new int[BUFCOUNT];
    int KeySegment;
    char* InputFileName;
    char* OutputFileName;
    ifstream InputFile;
    ofstream OutputFile;
    int PassCount;
    int CurrentLength;
    int NewLength;
    int LineLength;
    int TotalKeys = 0;
    bool StatisticsDisplayed = false;
    int TotalWrites = 0;
    int i;
    int j;
    double BufferRatio;
    int PartialLength;
    int TotalBufferSize;
    int KeyLength;

    if (argc < 4) {
        printf("Usage: zen07 keylength infile outfile\n");
        exit(1);
    } else {
        KeyLength = atoi(argv[1]);
        InputFileName = argv[2];
        OutputFileName = argv[3];
    }

    char temp[100];
    start_timing();
    InputFile.open(InputFileName,ios::in|ios::binary);

    //start counting pass
    int* BufferCapacity = new int[BUFCOUNT];
    for (i = 0; i < BUFCOUNT; i ++) {
        BufferCapacity[i] = 0;
    }

    for (i = 0; ; i ++) {
        InputFile.read(InputLine,INPUTLINESIZE);
        if (!InputFile)
            break;
        TotalKeys ++;
        LineLength = INPUTLINESIZE;
        if (LineLength < KeyLength) {
            printf("Illegal record: %s",InputLine);
            exit(1);
        }
        KeySegment = CalculateKeySegment(InputLine);
        BufferCapacity[KeySegment] += LineLength;
    }

    int Split[MAXPASSCOUNT]; // possible number of passes
    int SplitTotalSize[MAXPASSCOUNT]; // bytes per pass
    int SplitData;
    int ThisDisplacement;
    int TotalData = 0;
    Split[0] = 0;
    i = 0;
    for (j = 1; j < MAXPASSCOUNT; j ++) {
        SplitData = 0;
        BufferOffset[i] = 0;
        for (; i < BUFCOUNT; i ++) {
            ThisDisplacement = BufferCapacity[i];
            if (SplitData + ThisDisplacement > TOTAL_BUFFER)
                break;
            SplitData += ThisDisplacement;
            BufferOffset[i+1] = SplitData;
        }
        Split[j] = i;
        SplitTotalSize[j-1] = SplitData;
        TotalData += SplitData;
        if (i == BUFCOUNT)
            break;
    }
    delete [] BufferCapacity;
    PassCount = j;

    printf("Total buffer space: %d\n",TOTAL_BUFFER);
    printf("Total keys: %d\n", TotalKeys);
    printf("Total data: %d\n", TotalData);
    sprintf(temp,"Finished counting");
    timing(temp);

    OutputFile.open(OutputFileName,ios::out|ios::binary);
    char* BigBuffer = new char [TOTAL_BUFFER];
    memset(BigBuffer,0,TOTAL_BUFFER);

    for (int Pass = 0; Pass < PassCount ; Pass ++) {
        for (i = Split[Pass]; i < Split[Pass+1]; i ++) {
            BufferCharCount[i] = 0;
        }
        InputFile.clear();
        InputFile.seekg(0);
        int CompareResult;

        for (i = 0; ; i ++) {
            InputFile.read(InputLine,INPUTLINESIZE);
            if (!InputFile)
                break;
            TotalKeys ++;
            LineLength = INPUTLINESIZE;
            KeySegment = CalculateKeySegment(InputLine);
            char* Where;
            if (KeySegment >= Split[Pass] && KeySegment < Split[Pass+1]) {
                CurrentLength = BufferCharCount[KeySegment];
                char* CurrentPosition = BufferOffset[KeySegment]+BigBuffer;
                char* EndOfBuffer = CurrentPosition + CurrentLength;
                // Records are fixed-length, so step one whole record at a time.
                for (Where = CurrentPosition; Where < EndOfBuffer; Where += LineLength) {
                    CompareResult = memcmp(InputLine,Where,KeyLength);
                    if (CompareResult < 0)
                        break;
                }
                memmove(Where+LineLength,Where,EndOfBuffer-Where);
                memcpy(Where,InputLine,LineLength);
                BufferCharCount[KeySegment] += LineLength;
            }
        }
        OutputFile.write(BigBuffer,SplitTotalSize[Pass]);
        TotalWrites ++;
        sprintf(temp,"Finished distributing on pass %d",PassCount-Pass);
        timing(temp);
    }

    InputFile.close();
    OutputFile.close();
    printf("Total writes: %d\n", TotalWrites);
    end_timing();
    return 0;
}

... save storage by using a restricted character set and how to speed up access to records by employing hash coding (or "scatter storage") and caching (or keeping copies of recently accessed records in memory) ...

However, only some direct access devices allow nonsequential accesses without a significant time penalty; these are called random access devices. Unfortunately, disk drives are direct access devices, but not random access ones ...
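As a rough illustration of the restricted character set idea mentioned above, the following sketch packs three characters into one 16-bit word, in the spirit of the Radix40 representation named earlier. It works because 40 * 40 * 40 = 64,000, which fits in the 65,536 values of a 16-bit word. The 40-character alphabet, its ordering, and the names kAlphabet, CharToIndex, PackTriple, and UnpackTriple are assumptions made for this example; they are not taken from the book's listings.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical 40-character alphabet (space, three punctuation marks,
// digits, upper-case letters). The set and its ordering are assumptions.
static const char kAlphabet[] = " ,-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static_assert(sizeof(kAlphabet) - 1 == 40, "alphabet must hold 40 characters");

// Returns the position of c in the alphabet, or -1 if c cannot be represented.
int CharToIndex(char c) {
    const char* p = (c != '\0') ? std::strchr(kAlphabet, c) : nullptr;
    return p ? static_cast<int>(p - kAlphabet) : -1;
}

// Packs exactly three characters into one 16-bit word, treating each
// character as a base-40 digit. Returns false if a character is not legal.
bool PackTriple(const char* text, std::uint16_t* packed) {
    unsigned value = 0;
    for (int i = 0; i < 3; ++i) {
        int index = CharToIndex(text[i]);
        if (index < 0)
            return false;
        value = value * 40 + static_cast<unsigned>(index);
    }
    *packed = static_cast<std::uint16_t>(value);
    return true;
}

// Reverses PackTriple: peels off base-40 digits from the low end.
std::string UnpackTriple(std::uint16_t packed) {
    assert(packed < 64000);  // larger values are not valid packed triples
    std::string text(3, ' ');
    for (int i = 2; i >= 0; --i) {
        text[i] = kAlphabet[packed % 40];
        packed /= 40;
    }
    return text;
}
```

With this alphabet, PackTriple applied to "CAT" produces 26193 and UnpackTriple(26193) gives back "CAT", so three characters that would occupy 24 bits as plain bytes are stored in 16. The cost is that any character outside the alphabet, such as a lower-case letter, is rejected.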
Average number of accesses per record = 12.3631 accesses/record

Figure binary.search shows the calculation of the average number of accesses for a 10,000 item file. Notice that each line represents twice the number of records ...
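The 12.3631 figure can be checked with a short calculation. The sketch below assumes that every record is equally likely to be requested and that the search always succeeds: one record can be found on the first access, two on the second, four on the third, and so on, with whatever records remain falling on the last level. The function name AverageBinarySearchAccesses is made up for this example.

```cpp
#include <cassert>

// Average number of accesses a binary search needs to find a record,
// assuming all records are equally likely to be requested and present.
double AverageBinarySearchAccesses(long recordCount) {
    assert(recordCount > 0);
    long remaining = recordCount;   // records not yet assigned to a level
    long recordsAtLevel = 1;        // records found in exactly `level` accesses
    long totalAccesses = 0;
    for (long level = 1; remaining > 0; ++level) {
        long found = (recordsAtLevel < remaining) ? recordsAtLevel : remaining;
        totalAccesses += level * found;
        remaining -= found;
        recordsAtLevel *= 2;        // each level holds twice as many records
    }
    return static_cast<double>(totalAccesses) / recordCount;
}
```

For 10,000 records the first 13 levels account for 8,191 records and 98,305 accesses, and the remaining 1,809 records need 14 accesses each, for a total of 123,631 accesses, or 12.3631 per record.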

Date posted: 26/03/2019, 17:13

Table of contents

  • Cover

  • Introduction

  • Contents

  • Dedication

  • Acknowledgements

  • Prologue

  • A Supermarket Price Lookup System

  • A Mailing List System

  • Cn U Rd Ths (Qkly)? A Data Compression Utility

  • Free at Last: An Efficient Method of Handling Variable-Length Records

  • Heavenly Hash: A Dynamic Hashing Algorithm

  • Zensort: A Sorting Algorithm for Limited Memory

  • Mozart, No. Would You Believe Gershwin?

  • About the Author
