Raminator's Profile

Posts I've Made

  1. In Topic: Suggestions on code improvement

    Posted 20 Aug 2015

    Fixed the bug; it turned out to be a double bug, one in copy_file and one in copy_tree. It works now, but I'm still very unhappy with the way copy_tree is structured, and I can't see how to improve it. Does anyone have suggestions for making copy_tree less workaround-ish?
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import shutil
    import os
    
    from mutagen.flac import FLAC  # Used for metadata handling.
    from os import listdir  # Used for general operations.
    from fuzzywuzzy import fuzz  # Last resource name association.
    # Insert here the root directory of your library and device respectively.
    lib = 'C:/Users/berna/Desktop/Lib/'
    dev = 'C:/Users/berna/Desktop/Dev/'
    
    
    # Faster file copying function, arguments go as follows: Source file location,
    # target path and whether to append the source filename to the target path.
    def copy_file(SrcFile, TgtDir, KeepName=True):
        SourceFile = None
        TargetFile = None
        KeepGoing = False
    
        # Processes TgtDir depending on filename choice.
        if KeepName:
            TgtDir += os.path.basename(SrcFile)
            print(TgtDir)
        try:
            SourceFile = open(SrcFile, 'rb')
            TargetFile = open(TgtDir, 'wb')
            KeepGoing = True
            Count = 0
            while KeepGoing:
                # Read blocks of size 2**20 = 1048576
                Buffer = SourceFile.read(2 ** 20)
                if not Buffer:
                    break
                TargetFile.write(Buffer)
                Count += len(Buffer)
        finally:
            if TargetFile:
                TargetFile.close()
            if SourceFile:
                SourceFile.close()
        return KeepGoing
    
    
    # XXX Ugly workaround-ish, needs improvement.
    # Copies a directory (SrcDir) to TgtDir
    def copy_tree(SrcDir, TgtDir, Replace=True, Repeated=False):
        # The top-level call appends the source folder's name to TgtDir.
        # Recursive calls pass Repeated=True because TgtName already holds the
        # full target path; this flag is the workaround mentioned above.
        if not Repeated:
            TgtDir = format_dir(TgtDir, os.path.basename(SrcDir.rstrip("/")))
        if not os.path.isdir(TgtDir):
            os.makedirs(TgtDir)
        # Subs holds every file and folder inside the folder to be copied.
        Subs = listdir(SrcDir)
        Errors = []
        # For every subdirectory/file in Source
        for Sub in Subs:
            # Build Src/Sub and Tgt/Sub, stripping the trailing "/" that
            # format_dir adds.
            SrcName = format_dir(SrcDir, Sub).rstrip("/")
            TgtName = format_dir(TgtDir, Sub).rstrip("/")
            # Using try because file operations are nasty business.
            try:
                # A dir inside a dir is copied by calling this very function
                # again (recursion).
                if os.path.isdir(SrcName):
                    copy_tree(SrcName, TgtName, Repeated=True)
                # A plain file goes through copy_file. KeepName is False
                # because TgtName already includes the filename.
                else:
                    copy_file(SrcName, TgtName, KeepName=False)
            # If anything goes wrong, record the error and keep going.
            except (IOError, os.error) as why:
                Errors.append((SrcName, TgtName, str(why)))
        # Report all collected errors once the whole folder has been processed.
        if Errors:
            raise Exception(Errors)
    
    
    # Checks for new and deleted folders and returns their name.
    def check_folder(SrcDir, TgtDir):
        # Lists Source and Target folder.
        Source = listdir(SrcDir)
        Target = listdir(TgtDir)
        # Then creates a list of deprecated and new directories.
        Deleted = [FileName for FileName in Target if FileName not in Source]
        Added = [FileName for FileName in Source if FileName not in Target]
        # Returns both lists.
        return (Added, Deleted)
    
    
    # Checks for song in case there's a name mismatch or missing file.
    def check_song(SrcFile, TgtDir):
        Matches = []
        # Invariably the new name will be that of the source file, the issue here
        # is finding which song is the correct one.
        NewName = TgtDir + '/' + os.path.basename(SrcFile)
        TagSource = FLAC(SrcFile)
        # Grabs the number of samples in the original file.
        SourceSamples = TagSource.info.total_samples
        # Checks if any song has a matching sample number and if true appends the
        # song's filename to Matches[]
        for Song in listdir(TgtDir):
            # Only FLAC files can be opened with mutagen's FLAC class.
            if not Song.lower().endswith('.flac'):
                continue
            SongInfo = FLAC(TgtDir + '/' + Song)
            if (SongInfo.info.total_samples == SourceSamples):
                Matches.append(Song)
        # If several songs have the same sample count (same sample rate, 44100Hz
        # for CDs, and same length) it matches them to the source by filename
        # similarity.
        if (len(Matches) > 1):
            Diffs = []
            for Song in Matches:
                Diffs.append(fuzz.ratio(Song, os.path.basename(SrcFile)))
            # fuzz.ratio returns a score from 0 to 100, not 0 to 1.
            if (max(Diffs) > 80):
                BestMatch = TgtDir + '/' + Matches[Diffs.index(max(Diffs))]
                os.rename(BestMatch, NewName)
            else:
                shutil.copy(SrcFile, TgtDir)
        # If there's no match at all simply copy over the missing file.
        elif (len(Matches) == 0):
            shutil.copy(SrcFile, TgtDir)
        # If a single match is found the filename will be the first item on the
        # Matches[] list.
        else:
            os.rename(TgtDir + '/' + Matches[0], NewName)
    
    
    # Syncs folders in a directory and return the change count.
    def sync(SrcDir, TgtDir):
        AddCount = 0
        DeleteCount = 0
        # Grabs the folders to be added and deleted.
        NewDir, OldDir = check_folder(SrcDir, TgtDir)
        # Checks if any and then does add/rm.
        if OldDir:
            for Folder in OldDir:
                shutil.rmtree(TgtDir + Folder)
                DeleteCount += 1
        if NewDir:
            for Folder in NewDir:
                copy_tree(format_dir(SrcDir, Folder), TgtDir)
                AddCount += 1
        return(AddCount, DeleteCount)
    
    
    # Fixes missing metadata fields.
    def fix_metadata(SrcFile, TgtFile):
        TagSource = FLAC(SrcFile)
        TagTarget = FLAC(TgtFile)
        # Checks for deleted tags on source file and deletes them from target.
        OldTags = set(TagTarget) - set(TagSource)
        for Tag in OldTags:
            # Item deletion removes just this tag instead of wiping them all.
            del TagTarget[Tag]
        # Checks for new tags on source file and transfers them to target.
        NewTags = set(TagSource) - set(TagTarget)
        for Tag in NewTags:
            TagTarget[Tag] = TagSource[Tag]
        # Saves once, and only if something actually changed.
        if OldTags or NewTags:
            TagTarget.save(TgtFile)
    
    
    # Does metadata transfer between two files.
    def match_metadata(SrcFile, TgtFile):
        Altered = 0
        TagSource = FLAC(SrcFile)
        TagTarget = FLAC(TgtFile)
        # For every different Tag in source song copy it to target and save.
        for Tag in TagSource:
            if TagSource[Tag] != TagTarget[Tag]:
                Altered += 1
                TagTarget[Tag] = TagSource[Tag]
                TagTarget.save(TgtFile)
        return(Altered)
    
    
    # Simply does directory formatting to make things easier.
    def format_dir(Main, Second, Third=""):
        # Replaces \ with /
        Main = Main.replace('\\', '/')
        # Adds a / to the end of Main and concatenates Main and Second.
        if not Main.endswith('/'):
            Main += '/'
        Main += Second + '/'
        # Concatenates Main and Third if necessary.
        if (Third):
            Main += Third + '/'
        return (Main)
    
    # Sync main folders in lib with dev.
    sync(lib, dev)
    # For every Artist in lib sync its Albums
    for Artist in listdir(lib):
        sync(format_dir(lib, Artist), format_dir(dev, Artist))
        # For every Album in Artist match songs
        for Album in listdir(format_dir(lib, Artist)):
            # Declares lib Album and dev Album to make function calls shorter.
            CurrentAlbum = format_dir(lib, Artist, Album)
            CoAlbum = format_dir(dev, Artist, Album)
            for Song in listdir(CurrentAlbum):
                if (".flac" in Song or ".FLAC" in Song):
                    try:
                        # Tries to match lib and dev song's metadata.
                        match_metadata(CurrentAlbum + Song, CoAlbum + Song)
                    except Exception:
                        # If that fails will try to fix both Filename and Tag
                        # fields.
                        check_song(CurrentAlbum + Song, CoAlbum)
                        fix_metadata(CurrentAlbum + Song, CoAlbum + Song)
                        try:
                            # Try again after fix.
                            match_metadata(CurrentAlbum + Song, CoAlbum + Song)
                        except Exception as e:
                            # If it still doesn't work there's black magic in
                            # place: go sleep, drink a beer and try again later.
                            print("Ehm, something happened and your sync failed.\n"
                                  "Error: {}".format(e))
                            # Exit with a non-zero status to signal the failure.
                            raise SystemExit(1)
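
    On making copy_tree less workaround-ish, one possible restructuring, offered as a rough sketch rather than a tested drop-in: keep the folder naming in a thin public wrapper and do the recursion in a private helper, so the Repeated flag is no longer needed. The helper name _copy_contents is made up for this example, paths are built with os.path.join instead of format_dir, and shutil.copyfile stands in for copy_file to keep the block self-contained.

```python
import os
import shutil


def copy_tree(src_dir, tgt_root):
    """Copy src_dir into tgt_root, keeping the folder's own name."""
    # normpath drops a trailing slash so basename returns the folder name.
    name = os.path.basename(os.path.normpath(src_dir))
    errors = []
    _copy_contents(src_dir, os.path.join(tgt_root, name), errors)
    if errors:
        raise Exception(errors)


def _copy_contents(src_dir, tgt_dir, errors):
    """Recursively copy everything inside src_dir into tgt_dir.

    Hypothetical helper: tgt_dir is always the full target path, so no
    naming logic is repeated on recursive calls.
    """
    if not os.path.isdir(tgt_dir):
        os.makedirs(tgt_dir)
    for entry in os.listdir(src_dir):
        src_path = os.path.join(src_dir, entry)
        tgt_path = os.path.join(tgt_dir, entry)
        try:
            if os.path.isdir(src_path):
                _copy_contents(src_path, tgt_path, errors)
            else:
                # Stand-in for copy_file(src_path, tgt_path, KeepName=False).
                shutil.copyfile(src_path, tgt_path)
        except (IOError, OSError) as why:
            # Collect the error and carry on with the remaining entries.
            errors.append((src_path, tgt_path, str(why)))
```

    Called from sync as copy_tree(format_dir(SrcDir, Folder), TgtDir), this should place the folder at TgtDir/Folder, the same result the current version aims for.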
    
  2. In Topic: Suggestions on code improvement

    Posted 19 Aug 2015

    I've fixed the weird bug; it was merely a miswritten if statement on line 136, which should now read if (".flac" in Song or ".FLAC" in Song):.
    In related news, a new, more powerful buggy function has arisen. I've been unsatisfied with shutil.copy: it's slow for large files, and FLAC music is a little overweight. After some research and changes of my own I've come up with a slightly (roughly 10%) faster file transfer function, which is as follows:
    # Faster file copying function, arguments go as follows: Source file location,
    # target directory, whether to keep the filename intact and whether to create
    # the target directory in case it doesn't exist.
    def copy_file(SrcFile, TgtDir, KeepName=True, MakeDir=True):
        SourceFile = None
        TargetFile = None
        KeepGoing = False
        # Checks if TgtDir is valid and creates it if needed.
        if MakeDir and not os.path.isdir(TgtDir):
            os.makedirs(TgtDir)
        # Processes TgtDir depending on filename choice.
        if KeepName:
            TgtDir += os.path.basename(SrcFile)
            print(TgtDir)
        try:
            SourceFile = open(SrcFile, 'rb')
            TargetFile = open(TgtDir, 'wb')
            KeepGoing = True
            Count = 0
            while KeepGoing:
                # Read blocks of size 2**20 = 1048576
                Buffer = SourceFile.read(2 ** 20)
                if not Buffer:
                    break
                TargetFile.write(Buffer)
                Count += len(Buffer)
        finally:
            if TargetFile:
                TargetFile.close()
            if SourceFile:
                SourceFile.close()
        return KeepGoing
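
    For comparison, the manual read/write loop above is essentially what shutil.copyfileobj does when given an explicit buffer length. The following is a minimal sketch under that assumption; the name copy_file_buffered is made up for this example, the target is taken as a full file path, and no benchmark was run, so the roughly 10% figure is only the one quoted in this post.

```python
import os
import shutil

BLOCK_SIZE = 2 ** 20  # 1 MiB, the same block size as copy_file above


def copy_file_buffered(src_file, tgt_path, block_size=BLOCK_SIZE):
    """Copy src_file to tgt_path in block_size chunks; return bytes copied.

    Hypothetical variant of copy_file: the with statement closes both
    files even if reading or writing fails.
    """
    with open(src_file, 'rb') as source, open(tgt_path, 'wb') as target:
        shutil.copyfileobj(source, target, block_size)
    return os.path.getsize(tgt_path)
```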
    

    As far as I've tested, copy_file works just fine and I've had no issues with it whatsoever. That is, until I decided to write another copy_tree function, which is proving to be hell on earth. I've written 2 versions, neither of which works at all. Even when I copied Python's shutil.copytree source code I couldn't get the bloody thing to work. It produces all kinds of nasty bugs: it treats files as folders and makes a bunch of useless folders in the target directory, and it makes folders inside folders recursively. It's just a mess and I have no idea where to even begin fixing it. Here are the different versions of the function.
    def copy_tree(SrcDir, TgtDir, Replace=True):
        Names = listdir(SrcDir)
        if KeepName:
            TgtDir = format_dir(TgtDir, os.path.basename(SrcDir))
        if not os.path.isdir(TgtDir):
            os.makedirs(TgtDir)
        errors = []
        for Name in Names:
            SrcName = os.path.join(SrcDir, Name)
            TgtName = os.path.join(TgtDir, Name)
            try:
                if os.path.isdir(SrcName):
                    copy_tree(SrcName, TgtName)
                else:
                    copy_file(SrcName, TgtName, KeepName=False)
            except (IOError, os.error) as why:
                errors.append((SrcName, TgtName, str(why)))
            # catch the Error from the recursive copytree so that we can
            # continue with other files
            except Exception as err:
                errors.extend(err.args[0])
        try:
            shutil.copystat(SrcDir, TgtDir)
        except WindowsError:
            # can't copy file access times on Windows
            pass
        except OSError as why:
            errors.extend((SrcDir, TgtDir, str(why)))
        if errors:
            raise Exception(errors)
    

    And
    def copy_tree(SrcDir, TgtDir, Replace=True):
        TgtDir = format_dir(TgtDir, os.path.basename(SrcDir.rstrip("/")))
        if not os.path.isdir(TgtDir):
            os.makedirs(TgtDir)
        Subs = listdir(SrcDir)
        Errors = []
        for Sub in Subs:
            print("Sub: {}".format(Sub))
            SrcName = format_dir(SrcDir, Sub).rstrip("/")
            print("SrcName: {}".format(SrcName))
            TgtName = format_dir(TgtDir, Sub).rstrip("/")
            print("TgtName: {}".format(TgtName))
            try:
                if os.path.isdir(SrcName):
                    print("It's a dir! {}".format(SrcName))
                    copy_tree(SrcName, TgtName)
                else:
                    print("It's a File!\nSrcName: {}\nTgtName: {}".format(SrcName, TgtName))
                    copy_file(SrcName, TgtName, KeepName=False)
            except (IOError, os.error) as why:
                Errors.append((SrcName, TgtName, str(why)))
            if Errors:
                raise Exception(Errors)
    

    For some reason I just can't get my head around writing a correct directory copying function. Also, yes, it is meant to copy all files in that folder as well as all subfolders, their files, and so on.
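
    One reading of the two symptoms, traced from the code in this post and not confirmed by anything else in it: the second copy_tree appends the source folder's name on every call, including recursive ones, and copy_file with MakeDir=True runs os.makedirs on what is really a file path. The sketch below only reproduces those two steps in isolation; buggy_target and makedirs_then_open are hypothetical names, and posixpath is used so the traced paths look the same on any OS.

```python
import os
import posixpath
import tempfile


def buggy_target(src_dir, tgt_dir, sub_path):
    """Trace where the second copy_tree version would put sub_path.

    sub_path lists the names below src_dir, e.g. ['Album', 'song.flac'].
    """
    # Every call, recursive ones included, appends the source folder's name.
    tgt_dir = posixpath.join(tgt_dir, posixpath.basename(src_dir.rstrip('/')))
    if len(sub_path) == 1:
        return posixpath.join(tgt_dir, sub_path[0])
    return buggy_target(posixpath.join(src_dir, sub_path[0]),
                        posixpath.join(tgt_dir, sub_path[0]),
                        sub_path[1:])


def makedirs_then_open(path):
    """Mimic copy_file's MakeDir step when path is a file path, not a folder."""
    if not os.path.isdir(path):
        os.makedirs(path)  # creates a directory named like the file
    try:
        open(path, 'wb').close()
    except (IOError, OSError) as why:
        return why  # opening a directory for writing fails
    return None


# The album folder ends up nested inside a folder of the same name.
print(buggy_target('Lib/Artist/', 'Dev', ['Album', 'song.flac']))

# The "file" becomes a folder, and the write then fails.
file_path = os.path.join(tempfile.mkdtemp(), 'song.flac')
print(makedirs_then_open(file_path), os.path.isdir(file_path))
```

    If this reading is right, appending the folder name only on the top-level call and skipping the makedirs step when KeepName is False would address both symptoms.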
  3. In Topic: Suggestions on code improvement

    Posted 18 Aug 2015

    jon.kiparsky, on 19 August 2015 - 12:16 AM, said:

    Raminator, on 16 August 2015 - 10:27 PM, said:

    I've also placed it on GitHub since it's turning quite usable now.



    I think this shows a misunderstanding of what github is for. You should have your code under version control from the first line that you write. It's not there so that I can find your code, it's there so that you don't lose your code.

    I partially agree with you. I do use GitHub as a code sharing platform since, who knows, maybe someday someone will need a FLAC syncing Python script for whatever reason, and it's nice to have it easy to find on GitHub. However, I have had the code on GitHub from the very first version (the one in the original post); when I said "placed it on GitHub" I would have expressed myself better with "Added a README and organized the GitHub repo". I'll pay more attention next time, thanks for the tip jon.kiparsky!
  4. In Topic: Suggestions on code improvement

    Posted 18 Aug 2015

    So, I've begun to test the new code and this weird bug came up. Because I forgot to check whether the file is a .flac before running the metadata matching functions, it obviously crashed once it hit a cover.jpg file. That seemed easy to fix, so I went ahead and added a check before the try routine like this: if (".flac" or ".FLAC" in Song): (line 136). Weirdly enough, however, this had no effect whatsoever; the try routine runs anyway and thus crashes. Why would this be happening?
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import shutil
    import os
    
    from mutagen.flac import FLAC  # Used for metadata handling.
    from os import listdir  # Used for general operations.
    from fuzzywuzzy import fuzz  # Last resource name association.
    # Insert here the root directory of your library and device respectively.
    lib = 'C:/Users/berna/Desktop/Lib/'
    dev = 'C:/Users/berna/Desktop/Dev/'
    
    
    # Checks for new and deleted folders and returns their name.
    def check_folder(SrcDir, TgtDir):
        # Lists Source and Target folder.
        Source = listdir(SrcDir)
        Target = listdir(TgtDir)
        # Then creates a list of deprecated and new directories.
        Deleted = [FileName for FileName in Target if FileName not in Source]
        Added = [FileName for FileName in Source if FileName not in Target]
        # Returns both lists.
        return (Added, Deleted)
    
    
    # Checks for song in case there's a name mismatch or missing file.
    def check_song(SrcFile, TgtDir):
        Matches = []
        # Invariably the new name will be that of the source file, the issue here
        # is finding which song is the correct one.
        NewName = TgtDir + '/' + os.path.basename(SrcFile)
        TagSource = FLAC(SrcFile)
        # Grabs the number of samples in the original file.
        SourceSamples = TagSource.info.total_samples
        # Checks if any song has a matching sample number and if true appends the
        # song's filename to Matches[]
        for Song in listdir(TgtDir):
            SongInfo = FLAC(TgtDir + '/' + Song)
            if (SongInfo.info.total_samples == SourceSamples):
                Matches.append(Song)
        # If several songs have the same sample count (same sample rate, 44100Hz
        # for CDs, and same length) it matches them by filename similarity.
        if (len(Matches) > 1):
            Diffs = []
            for Song in Matches:
                Diffs.append(fuzz.ratio(Song, os.path.basename(SrcFile)))
            if (max(Diffs) > 80):  # fuzz.ratio scores range from 0 to 100.
                BestMatch = TgtDir + '/' + Matches[Diffs.index(max(Diffs))]
                os.rename(BestMatch, NewName)
            else:
                shutil.copy(SrcFile, TgtDir)
        # If there's no match at all simply copy over the missing file.
        elif (len(Matches) == 0):
            shutil.copy(SrcFile, TgtDir)
        # If a single match is found the filename will be the first item on the
        # Matches[] list.
        else:
            os.rename(TgtDir + '/' + Matches[0], NewName)
    
    
    # Syncs folders in a directory and return the change count.
    def sync(SrcDir, TgtDir):
        AddCount = 0
        DeleteCount = 0
        # Grabs the folders to be added and deleted.
        NewDir, OldDir = check_folder(SrcDir, TgtDir)
        # Checks if any and then does add/rm.
        if OldDir:
            for Folder in OldDir:
                shutil.rmtree(TgtDir + Folder)
                DeleteCount += 1
        if NewDir:
            for Folder in NewDir:
                shutil.copytree(SrcDir + Folder, TgtDir + Folder)
                AddCount += 1
        return(AddCount, DeleteCount)
    
    
    # Fixes missing metadata fields.
    def fix_metadata(SrcFile, TgtFile):
        TagSource = FLAC(SrcFile)
        TagTarget = FLAC(TgtFile)
        # Checks for deleted tags on source file and deletes them from target.
        if (set(TagTarget) - set(TagSource)):
            OldTags = list(set(TagTarget) - set(TagSource))
            for Tag in OldTags:
                # TODO Right now I haven't quite figured out how to delete
                # specific tags, so workaround is to delete them all.
                TagTarget.delete()
        # Checks for new tags on source file and transfers them to target.
        if (set(TagSource) != set(TagTarget)):
            NewTags = list(set(TagSource) - set(TagTarget))
            for Tag in NewTags:
                TagTarget["%s" % Tag] = TagSource[Tag]
                TagTarget.save(TgtFile)
    
    
    # Does metadata transfer between two files.
    def match_metadata(SrcFile, TgtFile):
        Altered = 0
        TagSource = FLAC(SrcFile)
        TagTarget = FLAC(TgtFile)
        # For every different Tag in source song copy it to target and save.
        for Tag in TagSource:
            if TagSource[Tag] != TagTarget[Tag]:
                Altered += 1
                TagTarget[Tag] = TagSource[Tag]
                TagTarget.save(TgtFile)
        return(Altered)
    
    
    # Simply does directory formatting to make things easier.
    def make_dir(Main, Second, Third=""):
        # Replaces \ with /
        Main = Main.replace('\\', '/')
        # Adds a / to the end of Main and concatenates Main and Second.
        if(Main[len(Main) - 1] != '/'):
            Main += '/'
        Main += Second + '/'
        # Concatenates Main and Third if necessary.
        if (Third):
            Main += Third + '/'
        return (Main)
    
    # Sync main folders in lib with dev.
    sync(lib, dev)
    # For every Artist in lib sync its Albums
    for Artist in listdir(lib):
        sync(make_dir(lib, Artist), make_dir(dev, Artist))
        # For every Album in Artist match songs
        for Album in listdir(make_dir(lib, Artist)):
            # Declares lib Album and dev Album to make function calls shorter.
            CurrentAlbum = make_dir(lib, Artist, Album)
            CoAlbum = make_dir(dev, Artist, Album)
            for Song in listdir(CurrentAlbum):
                if (".flac" or ".FLAC" in Song):
                    try:
                        # Tries to match lib and dev song's metadata.
                        match_metadata(CurrentAlbum + Song, CoAlbum + Song)
                    except:
                        # If that fails will try to fix both Filename and Tag
                        # fields.
                        check_song(CurrentAlbum + Song, CoAlbum)
                        fix_metadata(CurrentAlbum + Song, CoAlbum + Song)
                        try:
                            # Try again after fix.
                            match_metadata(CurrentAlbum + Song, CoAlbum + Song)
                        except Exception as e:
                            # If it still doesn't work there's black magic in place
                            # go sleep, drink a beer and try again later.
                            print("""Ehm, something happened and your sync failed.\n
                                  Error:{}""".format(e))
                            raise SystemExit(0)
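
    On why the check on line 136 has no effect: Python's `in` binds tighter than `or`, so the condition is read as ".flac" or (".FLAC" in Song), and a non-empty string such as ".flac" is always truthy. The small sketch below shows the difference; is_flac_buggy and is_flac are names made up for this example, and the case-insensitive endswith check is just one possible alternative, not the fix used elsewhere in this thread.

```python
def is_flac_buggy(song):
    # Parsed as: ".flac" or (".FLAC" in song). The left operand is a
    # non-empty string, so the whole expression is truthy for any song.
    return bool(".flac" or ".FLAC" in song)


def is_flac(song):
    # Hypothetical stricter check: compare the extension case-insensitively.
    return song.lower().endswith(".flac")


for name in ("01 Intro.flac", "02 Outro.FLAC", "cover.jpg"):
    print(name, is_flac_buggy(name), is_flac(name))
```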
    
    
  5. In Topic: Suggestions on code improvement

    Posted 17 Aug 2015

    DK3250, on 17 August 2015 - 04:06 PM, said:

    Maybe you misunderstand me. I am looking to the difference between
    if a-b:
    
    and
    if a != b:
    
    In both situations you get the condition True only if 'a' is different from 'b'. I think the second way of writing is best, and wonder why it shouldn't be applied to your line 84 (latest version of your code).

    Yes, indeed I had misunderstood you, my bad. I think I had tested this before; the thing here is that when comparing set()s, a smaller set minus a larger one that contains it yields an empty set. You can easily test this by opening your Python shell and writing the following code:
    a=['a','b','c','d']
    b=['a','b','c']
    print(set(a)-set(b))  # Prints a set containing only 'd'
    if(set(b)-set(a)):
        print("Not Empty: {}".format(set(b)-set(a)))
    
    

    If you run this, the print inside the if won't run, because set(b) - set(a) is empty: the difference only contains something when the first set has elements the second one lacks. With set(a) != set(b) I want it to run whenever b[] has elements that a[] lacks, and in this case we'll pass them over. In this sense I think you are right; would a more proper implementation be along these lines?
    def fix_metadata(SrcFile, TgtFile):
        TagSource = FLAC(SrcFile)
        TagTarget = FLAC(TgtFile)
        # Checks for deleted tags on source file and deletes them from target.
        if (set(TagTarget) - set(TagSource)):
            OldTags = list(set(TagTarget) - set(TagSource))
            for Tag in OldTags:
                # TODO Right now I haven't quite figured out how to delete
                # specific tags, so workaround is to delete them all.
                TagTarget.delete()
        # Checks for new tags on source file and transfers them to target.
        if (set(TagSource) - set(TagTarget)):
            NewTags = list(set(TagSource) - set(TagTarget))
            for Tag in NewTags:
                TagTarget["%s" % Tag] = TagSource[Tag]
                TagTarget.save(TgtFile)
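
    To make the difference between the two conditions concrete, here is a small sketch using plain lists in place of FLAC tag names. tag_changes is a hypothetical helper written for this example, not part of the script: inequality is symmetric and only says that the sets differ, while set difference is directional and says which side has the extra tags.

```python
def tag_changes(source_tags, target_tags):
    """Return (new, stale) tag names as sorted lists.

    new:   tags the source has and the target lacks (to be copied over)
    stale: tags the target has and the source lacks (to be deleted)
    """
    source, target = set(source_tags), set(target_tags)
    return sorted(source - target), sorted(target - source)


a = ['a', 'b', 'c', 'd']
b = ['a', 'b', 'c']

# Inequality holds in both directions, so it cannot tell which side changed.
both_unequal = (set(a) != set(b)) and (set(b) != set(a))
print(both_unequal)

# Difference is directional: only one side has something extra.
print(tag_changes(a, b))  # (['d'], [])
print(tag_changes(b, a))  # ([], ['d'])
```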
    
