Now that the processing has been carried out, I wish to merge the OCR files so that the number of files matches the number of original documents.
The text files are stored in the parent folder as follows:
ts001.txt ts002.txt ts003.txt ts004.txt ts00*.txt ts050.txt
Also stored within the folder is a CSV file containing markers that indicate the start of a new document:
ts001.txt Y
ts002.txt
ts003.txt Y
ts004.txt
ts005.txt
ts006.txt Y
ts007.txt
The finished combined files should be:
ts001.txt = ts001.txt + ts002.txt
ts003.txt = ts003.txt + ts004.txt + ts005.txt
ts006.txt = ts006.txt + ts007.txt
I am aware that a CSV file can be read in PowerShell in a way that lets its columns be referenced, which would give me access to the document-head column containing the Ys. However, as there is no closing value to indicate the end of a document, I am unsure how to group the files that belong to each original document in order to merge them.
I understand the principle of merging text documents through PowerShell; it is the grouping of documents that is giving me trouble.
Any help would be greatly appreciated.
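To make the grouping I am after concrete, here is a minimal sketch of the logic, written in Python rather than PowerShell purely for illustration. It assumes the CSV has no header row, the file name in the first column and the optional Y in the second; the function names and the separate output folder are hypothetical, not part of my actual setup. The idea is that a Y opens a new group and every unmarked file joins the most recent group, so no closing marker is needed.

```python
import csv
from pathlib import Path


def read_markers(csv_path):
    """Yield (file_name, marker) pairs from the CSV, in page order.

    Assumption: no header row, file name in column 1, optional "Y" in column 2.
    """
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            marker = row[1] if len(row) > 1 else ""
            yield row[0], marker


def group_pages(rows):
    """Group file names into documents.

    A "Y" marker starts a new document; each unmarked file is appended
    to the document started by the most recent "Y".
    """
    groups = []
    for name, marker in rows:
        if marker.strip().upper() == "Y" or not groups:
            groups.append([name])    # start of a new document
        else:
            groups[-1].append(name)  # continuation of the current document
    return groups


def merge_documents(folder, groups, out_dir):
    """Concatenate each group into one file named after its first page.

    Output goes to a separate folder (an assumption) so the source
    files are not overwritten while they are still being read.
    """
    folder, out_dir = Path(folder), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for group in groups:
        text = "".join((folder / name).read_text(encoding="utf-8") for name in group)
        (out_dir / group[0]).write_text(text, encoding="utf-8")


# The marker rows from the example above.
rows = [
    ("ts001.txt", "Y"), ("ts002.txt", ""),
    ("ts003.txt", "Y"), ("ts004.txt", ""), ("ts005.txt", ""),
    ("ts006.txt", "Y"), ("ts007.txt", ""),
]
groups = group_pages(rows)
print(groups)
```

With the real CSV, the rows would come from read_markers instead of the hard-coded list, and merge_documents would then write one combined file per group. The same loop should translate to PowerShell by keeping a "current group" variable while iterating over the imported rows.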
Regards, Craig
