8 Replies - 268 Views - Last Post: 02 June 2020 - 04:42 PM

#1 leace

Multiple system.out.println error

Posted 02 June 2020 - 01:38 PM

The text is extracted from PDF documents using Pattern and Matcher. The issue is that the code works only if every print line (System.out.println) finds matches for all of its groups, from Line1 to Line12.

The output stops printing as soon as a print line has no results.

For example, the code below prints only the Line1 result and nothing after it, because Line2 does not match anything.

In the System.out.println calls shown below, Line1 matches the data, so only its result appears in the output. The remaining lines show no results, even though there is matching data for Line5, Line8, Line10 and Line12.

Existing code:
 
package edfboxredfromfile;

import java.awt.geom.Rectangle2D;
import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.PDFTextStripperByArea;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class mk {
  public static void main(String[] args) {

    try {

      File file = new File("C:/Users/LPO20.pdf");

      PDDocument doc = PDDocument.load(file);
      PDFTextStripper pdfTextStripper = new PDFTextStripper();
      pdfTextStripper.setSortByPosition(true);
      pdfTextStripper.setStartPage(1);
      pdfTextStripper.setEndPage(6);
      String text = pdfTextStripper.getText(doc);
      String pattern = "(.*)(\\d+)(.*)";
      Pattern p = Pattern.compile("PO...........*?");
      Pattern p1 = Pattern.compile("Vendor...........");
      Pattern p2 = Pattern.compile("100.....*?");
      Pattern p3 = Pattern.compile("EWRRR.........................*?");
      Pattern p4 = Pattern.compile("(?<=Date...................................:).*");
      Pattern p9 = Pattern.compile("(?<=5449000165336........)...................");
      Pattern p10 = Pattern.compile("(?<=5449000165336...........................).....");
      Pattern p11 = Pattern.compile("(?<=5449000105394........)....................");
      Pattern p12 = Pattern.compile("(?<=5449000105394.....................).....", Pattern.DOTALL);
      Pattern p5 = Pattern.compile("544...........3*?");
      Pattern p6 = Pattern.compile("544900010.....*?");

      Pattern p7 = Pattern.compile("627101499504..*?");
      Pattern p8 = Pattern.compile("(.*)(\\d)(.*)", Pattern.MULTILINE);
      Pattern p13 = Pattern.compile("5449000138....*?");
      Pattern p14 = Pattern.compile("(?<=5449000138514........)...................");
      Pattern p15 = Pattern.compile("(?<=5449000138514..............................).....");
      Pattern p16 = Pattern.compile("5449000132....*?");
      Pattern p17 = Pattern.compile("(?<=5449000132673........)............");
      Pattern p18 = Pattern.compile("(?<=5449000132673....................)......");
      Pattern p19 = Pattern.compile("5449000150....*?");
      Pattern p20 = Pattern.compile("(?<=5449000150257........)...................");
      Pattern p21 = Pattern.compile("(?<=5449000150257..........................)......");
      Pattern p22 = Pattern.compile("5449000039....*?");
      Pattern p23 = Pattern.compile("(?<=5449000039880........)...................");
      Pattern p24 = Pattern.compile("(?<=5449000039880..........................)......");
      Pattern p25 = Pattern.compile("544900008399..*?");
      Pattern p26 = Pattern.compile("(?<=5449000083999.......)...................");
      Pattern p27 = Pattern.compile("(?<=5449000083999.........................).......");
      Pattern p28 = Pattern.compile("5449000083982..*?");
      Pattern p29 = Pattern.compile("(?<=5449000083982.......)...................");
      Pattern p30 = Pattern.compile("(?<=5449000083982..........................)......");
      Pattern p31 = Pattern.compile("5449000168436..*?");
      Pattern p32 = Pattern.compile("(?<=5449000168436.......)...................");
      Pattern p33 = Pattern.compile("(?<=5449000168436..........................)......");
      Pattern p34 = Pattern.compile("5449000168443..*?");
      Pattern p35 = Pattern.compile("(?<=5449000168443.......)...................");
      Pattern p36 = Pattern.compile("(?<=5449000168443.............................)......");
      Pattern p37 = Pattern.compile("5449000088444..*?");
      Pattern p38 = Pattern.compile("(?<=5449000088444.......)...................");
      Pattern p39 = Pattern.compile("(?<=5449000088444............................).....");
      Pattern p40 = Pattern.compile("5449000271921..*?");
      Pattern p41 = Pattern.compile("(?<=5449000271921.......)...................");
      Pattern p42 = Pattern.compile("(?<=5449000271921............................).....");
      Matcher m = p.matcher(text);
      Matcher m1 = p1.matcher(text);
      Matcher m2 = p2.matcher(text);
      Matcher m3 = p3.matcher(text);
      Matcher m4 = p4.matcher(text);
      Matcher m5 = p5.matcher(text);
      Matcher m6 = p6.matcher(text);
      Matcher m7 = p7.matcher(text);
      Matcher m8 = p8.matcher(text);
      Matcher m9 = p9.matcher(text);
      Matcher m10 = p10.matcher(text);
      Matcher m11 = p11.matcher(text);
      Matcher m12 = p12.matcher(text);
      Matcher m13 = p13.matcher(text);
      Matcher m14 = p14.matcher(text);
      Matcher m15 = p15.matcher(text);
      Matcher m16 = p16.matcher(text);
      Matcher m17 = p17.matcher(text);
      Matcher m18 = p18.matcher(text);
      Matcher m19 = p19.matcher(text);
      Matcher m20 = p20.matcher(text);
      Matcher m21 = p21.matcher(text);
      Matcher m22 = p22.matcher(text);
      Matcher m23 = p23.matcher(text);
      Matcher m24 = p24.matcher(text);
      Matcher m25 = p25.matcher(text);
      Matcher m26 = p26.matcher(text);
      Matcher m27 = p27.matcher(text);
      Matcher m28 = p28.matcher(text);
      Matcher m29 = p29.matcher(text);
      Matcher m30 = p30.matcher(text);
      Matcher m31 = p31.matcher(text);
      Matcher m32 = p32.matcher(text);
      Matcher m33 = p33.matcher(text);
      Matcher m34 = p34.matcher(text);
      Matcher m35 = p35.matcher(text);
      Matcher m36 = p36.matcher(text);
      Matcher m37 = p37.matcher(text);
      Matcher m38 = p38.matcher(text);
      Matcher m39 = p39.matcher(text);
      Matcher m40 = p40.matcher(text);
      Matcher m41 = p41.matcher(text);
      Matcher m42 = p42.matcher(text);
      m.find();
      m1.find();
      m2.find();
      m3.find();
      m4.find();
      m5.find();
      m6.find();
      m7.find();
      m8.find();
      m9.find();
      m10.find();
      m11.find();
      m12.find();
      m13.find();
      m14.find();
      m15.find();
      m16.find();
      m17.find();
      m18.find();
      m19.find();
      m20.find();
      m21.find();
      m22.find();
      m23.find();
      m24.find();
      m25.find();
      m26.find();
      m27.find();
      m28.find();
      m29.find();
      m30.find();
      m31.find();
      m32.find();
      m33.find();
      m34.find();
      m35.find();
      m36.find();
      m37.find();
      m38.find();
      m39.find();
      m40.find();
      m41.find();
      m42.find();

System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|" + m5.group(0) + "|"+ m9.group(0) + "|"+ m10.group(0) + "|" );
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m6.group(0) + "|"+ m11.group(0) + "|"+ m12.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m13.group(0) + "|" + m14.group(0) + "|"+ m15.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m16.group(0) + "|" + m17.group(0) + "|"+ m18.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m19.group(0) + "|" + m20.group(0) + "|"+ m21.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m22.group(0) + "|" + m23.group(0) + "|"+ m24.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m25.group(0) + "|" + m26.group(0) + "|"+ m27.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m28.group(0) + "|" + m29.group(0) + "|"+ m30.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m31.group(0) + "|" + m32.group(0) + "|"+ m33.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m34.group(0) + "|" + m35.group(0) + "|"+ m36.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m37.group(0) + "|" + m38.group(0) + "|"+ m39.group(0) + "|");
System.out.println( m.group(0) + "|"+ m1.group(0) + "|"+ m2.group(0)+ "|"+ m3.group(0) + "|"+ "erf"+ "|"+ m4.group(0) + "|"+  m40.group(0) + "|" + m41.group(0) + "|"+ m42.group(0) + "|");

      doc.close();
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}  





Current output:
PO-004343434|Vendor : TL-00054|100203 |PINGO |HENKA| 5/10/2020|5449000165336 |LOHUI| 10.0|

Expected output:
PO-004343434|Vendor : TL-00054|500206 |PINGO |HENKA| 5/10/2020|5449000165336 |LOHUI| 10.0|
PO-004343434|Vendor : TL-00054|500206 |PINGO |HENKA| 5/10/2020|5449000165345 |LOHUI| 30.0|
PO-004343434|Vendor : TL-00054|500206 |PINGO |HENKA| 5/10/2020|5449000165367 |LOHUI| 20.0|
PO-004343434|Vendor : TL-00054|500206 |PINGO |HENKA| 5/10/2020|5449000165388 |LOHUI| 40.0|

This post has been edited by leace: 02 June 2020 - 01:48 PM

Replies To: Multiple system.out.println error

#2 NormR

Re: Multiple system.out.println error

Posted 02 June 2020 - 01:44 PM

Quote

Output stop printing further if that last print line has no results.

That can happen if the code tries to print a binary 0. I'm not sure exactly what happens, but later print statements are not displayed.
The solution is to make sure the Strings being printed do not contain any low binary values, like those less than 0x0D.

#3 leace

Re: Multiple system.out.println error

Posted 02 June 2020 - 01:46 PM

Thank you. I have edited the post. What could be missing in the code to avoid this issue?

This post has been edited by leace: 02 June 2020 - 01:51 PM

#4 NormR

Re: Multiple system.out.println error

Posted 02 June 2020 - 01:53 PM

Quote

to avoid this issue ?

Please explain what the issue is.

#5 leace

Re: Multiple system.out.println error

Posted 02 June 2020 - 02:01 PM

Let's say I have three System.out.println calls, for example: System.out.println(Test1); System.out.println(Test2); System.out.println(Test3);

The issue is that if the third line has matching data, it is still not displayed when the first and second lines have no matching data.
At the same time, if the first and second lines have matching data, the results for the first and second lines are displayed.

#6 NormR

Re: Multiple system.out.println error

Posted 02 June 2020 - 02:04 PM

// I have 3 System.out.println calls, for example:
System.out.println(Test1);
System.out.println(Test2);
System.out.println(Test3);




Those three print statements as coded should print 3 lines with the contents of the Test variables.

I don't see why they all would not execute as expected.

Quote

if suppose third line has matched data

What does that have to do with the 3 print statements?

This post has been edited by NormR: 02 June 2020 - 02:05 PM

#7 leace

Re: Multiple system.out.println error

Posted 02 June 2020 - 02:09 PM

Hi, as I mentioned, if line1, line2 and line3 all have matching data, everything is displayed correctly. If only line3 has matching data, nothing is shown, because line1 and line2 have no matches. Line3 should in fact display its result, since it has matching data; and if you remove line1 and line2, line3 is displayed correctly.

#8 NormR

Re: Multiple system.out.println error

Posted 02 June 2020 - 02:14 PM

Can you make a small, simple program that compiles and executes and shows the problem?
Have all the Strings and input declared in the posted code. Don't require extra files.

Note: The code should receive and use the value returned by the find method.
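
A small program of the kind requested might look like the sketch below. It is not the original poster's code: the class name FindDemo, the helper groupAfterUncheckedFind and the sample text are all made up for illustration, and no PDF file is needed. It shows that calling group(0) after a find() that did not match throws IllegalStateException. In the posted code that exception is not an IOException, so the catch block would not handle it and the remaining print statements would never run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {

    /**
     * Calls find() without checking its result, as the posted code does,
     * then calls group(0). Returns the matched text, or the simple name of
     * the exception if group(0) throws because nothing was matched.
     * (Hypothetical helper, for illustration only.)
     */
    static String groupAfterUncheckedFind(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        m.find(); // return value ignored on purpose
        try {
            return m.group(0);
        } catch (IllegalStateException e) {
            // group() throws this when the previous match operation failed
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input standing in for the extracted PDF text
        String text = "PO-004343434 Vendor : TL-00054";

        // This pattern matches, so the matched text is printed
        System.out.println(groupAfterUncheckedFind("PO-\\d+", text));

        // This pattern does not match, so group(0) throws
        System.out.println(groupAfterUncheckedFind("5449000165336", text));
    }
}
```

Without the try/catch inside the helper, the second call would end the program with a stack trace, which would explain output that stops after the first line.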

This post has been edited by NormR: 02 June 2020 - 02:34 PM

#9 g00se

Re: Multiple system.out.println error

Posted 02 June 2020 - 04:42 PM

Matcher.find is a boolean method. You're ignoring the returned value. Paying attention to it, and not concatenating when it returns false, will solve your problem.
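
One way to apply this advice is sketched below. The names SafeGroupDemo, firstMatchOrEmpty and row are assumptions introduced for the example, as are the sample text and the simplified regular expressions; they are not taken from the posted program. The idea is to call group(0) only when find() returned true, and to skip a whole output row when its item code is not present.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SafeGroupDemo {

    /** Returns the first match of regex in text, or "" when find() reports no match. */
    static String firstMatchOrEmpty(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(0) : "";
    }

    /**
     * Builds one pipe-delimited row from a header and an item pattern.
     * Returns null when the item is not found, so the caller can skip the row.
     */
    static String row(String header, String itemRegex, String text) {
        String item = firstMatchOrEmpty(itemRegex, text);
        if (item.isEmpty()) {
            return null; // nothing matched: do not concatenate
        }
        return header + "|" + item + "|";
    }

    public static void main(String[] args) {
        // Hypothetical sample input standing in for the extracted PDF text
        String text = "PO-004343434 Vendor : TL-00054 5449000165336 LOHUI 10.0";

        String header = firstMatchOrEmpty("PO-\\d+", text);

        // The first code is absent from the sample text, the second is present
        String[] itemRegexes = {"5449000105394", "5449000165336"};
        for (String itemRegex : itemRegexes) {
            String line = row(header, itemRegex, text);
            if (line != null) {
                System.out.println(line);
            }
        }
    }
}
```

Keeping the patterns in an array and looping over them would also replace the long run of numbered Pattern and Matcher variables, and a missing item then affects only its own row.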