Apache commons-io and filesystems

Software is hard. When you write a Java program, you're already so far up the Jenga tower of a modern computing system that when a lower brick makes you question your sanity, it's something else.

It is possible for a file's metadata, such as its last-modified time, to be updated before the file's payload is updated accordingly. Let that sink in.

Consider a program's log file that is constantly being appended to, rotated, and so on. The Apache commons-io Tailer object can be constructed in such a way that it continually resets itself to the head of the file, because it thinks the file has been truncated or overwritten with data of exactly the same length. If you pass in a zero delay (the Tailer's default is one second, as is tail's), a poll may see that the file's modification time has changed, but by the time it gets to the length check the length still equals the current position, so it resets to the beginning of the file.
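The window exists because one poll takes two separate observations of the file. Here is a minimal sketch of that using only java.io; PollSnapshot, take and demo are made-up names for illustration, and the timestamp comparison merely stands in for FileUtils.isFileNewer on the assumption that it boils down to a lastModified check.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, not part of commons-io: samples a file the way one poll iteration does.
class PollSnapshot {
    final boolean newer; // modification time is later than the one recorded last time
    final long length;   // size in bytes, obtained by a second, separate call

    PollSnapshot(boolean newer, long length) {
        this.newer = newer;
        this.length = length;
    }

    static PollSnapshot take(File file, long lastSeenMillis) {
        // First observation: the timestamp.
        boolean newer = file.exists() && file.lastModified() > lastSeenMillis;
        // Nothing ties the next call to the same instant as the previous one,
        // so the two values may describe different states of the file.
        long length = file.length();
        return new PollSnapshot(newer, length);
    }

    // Writes a small temporary file and samples it once.
    static PollSnapshot demo() {
        try {
            File log = File.createTempFile("tail-demo", ".log");
            log.deleteOnExit();
            Files.write(log.toPath(), "hello\n".getBytes(StandardCharsets.UTF_8));
            return take(log, 0L);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PollSnapshot s = demo();
        System.out.println("newer=" + s.newer + " length=" + s.length);
    }
}
```

On a quiet temporary file the two values agree. The trouble only shows up when a writer is active between the two calls. The relevant part of the Tailer's polling loop, for comparison: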

            while (getRun()) {
                final boolean newer = FileUtils.isFileNewer(file, last); // IO-279, must be done first
                // Check the file length to see if it was rotated
                final long length = file.length();
                if (length < position) {
                    // File was rotated
                    listener.fileRotated();
                    // Reopen the reader after rotation
                    try {
                        // Ensure that the old file is closed iff we re-open it successfully
                        final RandomAccessFile save = reader;
                        reader = new RandomAccessFile(file, RAF_MODE);
                        // At this point, we're sure that the old file is rotated
                        // Finish scanning the old file and then we'll start with the new one
                        try {
                            readLines(save);
                        } catch (IOException ioe) {
                            listener.handle(ioe);
                        }
                        position = 0;
                        // close old file explicitly rather than relying on GC picking up previous RAF
                        IOUtils.closeQuietly(save);
                    } catch (final FileNotFoundException e) {
                        // in this case we continue to use the previous reader and position values
                        listener.fileNotFound();
                    }
                    continue;
                } else {
                    // File was not rotated
                    // See if the file needs to be read again
                    if (length > position) {
                        // The file has more content than it did last time
                        position = readLines(reader);
                        last = file.lastModified();
                    } else if (newer) {
                        /*
                         * This can happen if the file is truncated or overwritten with the exact same length of
                         * information. In cases like this, the file position needs to be reset
                         */
                        position = 0;
                        reader.seek(position); // cannot be null here

                        // Now we can read new lines
                        position = readLines(reader);
                        last = file.lastModified();
                    }
                }
                // ...
            }

We enter the else block when the length is not less than the position, that is, when the file has not been rotated. Inside it, if the length is also not greater than the position (which would mean more lines to process) but the file is newer, the Tailer assumes the file must have been truncated or overwritten to exactly the same size, and it starts over from position zero. Moving from a zero delay to 10 milliseconds alleviated this problem. I suppose the lesson here is to not assume atomic operations on your file system, or to just use the defaults? Happy coding!
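To make the branches easier to poke at, here is a rough model of the decision as a pure function. TailDecision, Action and decide are invented for this sketch and are not commons-io code. They only mirror the three comparisons in the excerpt above.

```java
// Hypothetical model of the branch structure in the excerpt; not commons-io code.
class TailDecision {
    enum Action { ROTATED, READ_NEW_LINES, RESET_TO_START, IDLE }

    // length:   current file length in bytes
    // position: how far the reader has already consumed
    // newer:    whether the modification time moved since the last read
    static Action decide(long length, long position, boolean newer) {
        if (length < position) {
            return Action.ROTATED;        // file shrank: treated as a rotation
        }
        if (length > position) {
            return Action.READ_NEW_LINES; // file grew: read what was appended
        }
        if (newer) {
            return Action.RESET_TO_START; // same length but newer: assumed rewritten
        }
        return Action.IDLE;               // nothing changed
    }

    public static void main(String[] args) {
        long position = 100;
        // Poll lands inside the window: timestamp moved, length has not caught up yet.
        System.out.println(decide(100, position, true));  // RESET_TO_START
        // Poll lands after the appended payload is visible.
        System.out.println(decide(120, position, true));  // READ_NEW_LINES
        System.out.println(decide(100, position, false)); // IDLE
        System.out.println(decide(40, position, true));   // ROTATED
    }
}
```

The first call is the case from this post: an append in progress is indistinguishable from a same-size overwrite if the poll catches the new timestamp with the old length. A longer delay does not remove that ambiguity. It only makes it less likely that a poll lands in the gap.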
