I recently received a large text file (many megabytes) which was a concatenation of many smaller files. The smaller files were of different sizes and were separated by a particular character. I did a quick Google search, but almost all of the file splitting applications available split files based on size.
I needed to split a file based on a certain ASCII character.
Written using C#, my text file splitting app is extremely simple.
Step 1:
Read the entire text file into a string variable
    string file = tr.ReadToEnd();
Step 2:
Using the string function 'Split', create a string array.
    string[] files = file.Split(splitChar);
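One thing worth knowing about 'Split' is that it keeps empty pieces, for example when two separator characters sit next to each other or the text ends with a separator. The following is only a small sketch to show that behaviour; the class and method names (SplitDemo, SplitKeepEmpty, SplitDropEmpty) are made up for illustration and are not part of the app.

```csharp
using System;

public static class SplitDemo
{
    // Splits text on a single separator character, keeping empty pieces
    // (the default behaviour of string.Split).
    public static string[] SplitKeepEmpty(string text, char separator)
    {
        return text.Split(separator);
    }

    // Same split, but empty pieces are dropped up front.
    public static string[] SplitDropEmpty(string text, char separator)
    {
        return text.Split(new[] { separator }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With the input "a|b||c|" and '|' as the separator, the first method returns five pieces (two of them empty) and the second returns three. Dropping empty pieces up front would mean they are never seen by the loop in step 3, so they would not be counted there either.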
Step 3:
Iterate through the string array, and write each to a separate file.
    foreach (string file in files)
    {
        if (file.Length != 0)
        {
            if (CreateFile(file, string.Format("{0}\\output_{1}.{2}", outputDir, _writtenCount.ToString(), fileExt)))
            {
                _writtenCount++;
            }
            else
            {
                _errorCount++;
            }
        }
        else
        {
            _errorCount++;
        }
    }
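The loop above relies on a 'CreateFile' helper that returns true or false. Its body is not shown here, so the following is only a guess at what such a helper could look like, assuming it simply writes the text to disk and reports whether that worked. The names FileWriterSketch and BuildOutputPath are hypothetical.

```csharp
using System;
using System.IO;

public static class FileWriterSketch
{
    // Hypothetical stand-in for CreateFile: writes the content to the
    // given path and reports success as a bool instead of throwing.
    public static bool CreateFile(string content, string path)
    {
        try
        {
            File.WriteAllText(path, content);
            return true;
        }
        catch (IOException)
        {
            // Covers a missing directory, a locked file, a full disk, etc.
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            return false;
        }
    }

    // Builds "output_<index>.<extension>" inside outputDir. Path.Combine
    // takes care of the directory separator, so no hard-coded "\\" is needed.
    public static string BuildOutputPath(string outputDir, int index, string extension)
    {
        return Path.Combine(outputDir, string.Format("output_{0}.{1}", index, extension));
    }
}
```

Returning a bool rather than letting the exception escape keeps the loop simple: one failed piece bumps the error count and the remaining pieces are still written.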
That's it!!!
While there is some beauty in simple design, the differentiating aspect of this application is the status screen.
All of the file reading/splitting/writing is contained within the class 'Splitter'. I have tried to keep it independent of the application so that it could be re-used within other applications if needed. Thus it cannot know about the application, its forms or its controls.
However, I do want my applications (that use this class) to give the user some feedback on how the splitting process is proceeding. I have achieved this by raising 'events' within the 'Splitter' class.
First of all, the event delegates need to be defined. They are declared as public, but still within the 'Splitter' class.
    public delegate void FileProcessedHandler(object sender);
    public delegate void FileWrittenHandler(object sender);
    public delegate void FileCompleteHandler(object sender);
Then the events need to be declared within the class.
    public event FileProcessedHandler FileProcessedEvent;
    public event FileWrittenHandler FileWrittenEvent;
    public event FileCompleteHandler FileCompleteEvent;
Then the events can be raised within the code of the class.
    public void ProcessFile(string filename, char splitCharacter, string outputDirectory, string outputExtension)
    {
        string file = ReadFile(filename);
        string[] splitFiles = SplitFile(file, splitCharacter.ToString().ToCharArray());
        _fileCount = splitFiles.Length;
        // An event is null until something subscribes, so check before raising it.
        if (FileProcessedEvent != null) FileProcessedEvent(this);
        SaveFiles(splitFiles, outputDirectory, outputExtension);
        if (FileCompleteEvent != null) FileCompleteEvent(this);
    }
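On the application side, the status screen only has to subscribe to these events. Here is a rough sketch of that wiring. MiniSplitter is a cut-down, hypothetical stand-in for the real 'Splitter' class (it only counts the pieces and raises two of the events), and StatusLog stands in for the form, collecting messages in a list rather than updating controls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down stand-in for the Splitter class.
public class MiniSplitter
{
    public delegate void FileProcessedHandler(object sender);
    public delegate void FileCompleteHandler(object sender);

    public event FileProcessedHandler FileProcessedEvent;
    public event FileCompleteHandler FileCompleteEvent;

    public int FileCount { get; private set; }

    public void Process(string text, char splitCharacter)
    {
        FileCount = text.Split(splitCharacter).Length;

        // Copy to a local and null-check: an event with no subscribers is null.
        FileProcessedHandler processed = FileProcessedEvent;
        if (processed != null) processed(this);

        FileCompleteHandler complete = FileCompleteEvent;
        if (complete != null) complete(this);
    }
}

// Stand-in for the status screen: records messages instead of touching controls.
public class StatusLog
{
    public readonly List<string> Messages = new List<string>();

    public void Attach(MiniSplitter splitter)
    {
        // The sender is the splitter itself, so its public state can be read back.
        splitter.FileProcessedEvent += sender =>
            Messages.Add("Found " + ((MiniSplitter)sender).FileCount + " pieces");
        splitter.FileCompleteEvent += sender => Messages.Add("Done");
    }
}
```

The splitter never references StatusLog, which is the point of the design: any application can attach its own handlers, and a caller that attaches none still gets a working splitter.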
If you would like to download the application, or the full source code, please go to the Text File Splitter Project.
- dgrinberg's blog